Methods for Compression of Multivariate Correlated Data for Multi-Channel Communication

ABSTRACT

Methods are provided for efficiently encoding and decoding multivariate correlated data sequences for transmission over multiple channels of a network. The methods include transforming data vectors from correlated sources into vectors that comprise substantially independent and correlated components, and generating a common information vector, based on the correlated components, as well as two private information vectors. The methods also include computing the amount of information, such as Wyner's lossy common information, in the common information vector, computing rates that lie on the Gray-Wyner rate region, and choosing compression rates based on the amount of common information. The methods may be applicable, in general, to a wide range of communications and/or storage systems and, particularly, to sensor networks and multi-user virtual environments for gaming and other applications.

FIELD OF THE DISCLOSURE

The disclosure generally relates to methods of compressively encoding multiple correlated data vectors for efficient transmission over a plurality of network channels, and to general information transmission over networks. The methods are also applicable to information transmission over noisy channels, such as multiple access channels, interference channels, and/or broadcast channels, that use private and common information.

BACKGROUND

The main goal of telecommunication and information systems is to reproduce sent or stored information reliably, either approximately or with required fidelity, and with efficient use of system resources. The information is represented by signals, and system resources may include the energy that is put into a signal, as well as the bandwidth or storage space that is used by the signal. Converting information into digital form, or bits, allows flexible application of many computing and signal processing techniques to make the systems as efficient as possible, approaching mathematically derived theoretical limits in some applications. Coding techniques have been developed to allow an efficient transfer of digital information by removing redundancy in a source of information and then, if necessary, adding encoding redundancy (e.g., a checksum) to enable detection or correction of occasional bit errors. A variety of modulation techniques can be used in systems to convert encoded digital information into continuous signals that can be communicated over a variety of physical channels using radio waves, optical techniques, magnetic domains, etc. On the receiving end, demodulation, followed by decoding, recovers the information.

Current telecommunication and information technology systems are designed based on Shannon's operational definitions of coding capacity, which defines the maximum rate for reliable communication by channel encoders and decoders over noisy channels, and coding compression, which defines the minimum compression rate of source encoders and decoders. Generally, channel encoders and decoders are utilized to combat communication noise, and source encoders and decoders are utilized to remove redundancy in data. In a typical wired or wireless data transmission system, a source encoder is used to efficiently compress the data to be transmitted. Subsequently, a channel encoder receives the compressed data and adds redundancy to protect the data against the noise of the channel. The receiver located at the other end of the channel, upon receiving the encoded data, applies the inverse operations, which consist of a channel decoder (inverse of the channel encoder) and a source decoder (inverse of the source encoder). The original data can be made available to the user with an arbitrarily small probability of error or loss. To use the channel resources efficiently, however, so-called lossy coding techniques may instead reconstruct a close approximation of the original data, subject to a distortion or fidelity criterion. Lossy coding techniques have enabled gains in the capacities of point-to-point or broadcasting systems that routinely deliver, for example, high quality digital audio and video to users.

When encoding and reproducing information from correlated sources, there is a need for techniques that enable efficient communication of information from the multiple correlated sources and over multiple channels. Such techniques are applicable in many technical fields, such as, for example, the internet of things (IoT) and/or sensor networks, multiplayer games, and virtual collaborative environments that generate multiple correlated perspectives on computer-generated data. Accordingly, there is a need for efficiently encoding and communicating information from correlated sources over multi-channel networks.

SUMMARY

A computer-implemented method is used to compressively encode two correlated data vectors. In one aspect, the method comprises obtaining a first data vector and a second data vector. The method also comprises transforming the first data vector into a first canonical vector, wherein the first canonical vector includes a first component indicative of information in the first data vector and information in the second data vector, and a second component indicative of information in the first data vector and substantially exclusive of information in the second data vector. Similarly, the method comprises transforming the second data vector into a second canonical vector, wherein the second canonical vector includes a first component, indicative of information in the first data vector and information in the second data vector, and a second component, indicative of information in the second data vector and substantially exclusive of information in the first data vector. The method also comprises generating a common information vector based on the first component of the first canonical vector and the first component of the second canonical vector. Similarly, the method comprises generating a first private vector based on the first canonical vector and the common information vector, and a second private vector based on the second canonical vector and the common information vector. The method also comprises compressing the first private vector at a first private rate to generate a first digital message and compressing the second private vector at a second private rate to generate a second digital message. The method comprises computing an amount of common information contained in the common information vector, and, based on the amount of common information, computing a third rate. The method also comprises compressing the common information vector at the third rate to generate a third digital message. The method comprises routing the first digital message via a first channel, the second digital message via a second channel, and the third digital message via a third channel.

In another aspect, a non-transitory computer-readable medium stores instructions for compressively encoding two correlated data vectors, wherein the instructions, when executed by one or more processors of a computing system, cause the one or more processors to obtain a first data vector and a second data vector. The instructions also cause the one or more processors to transform the first data vector into a first canonical vector, wherein the first canonical vector includes a first component indicative of information in the first data vector and information in the second data vector, and a second component indicative of information in the first data vector and substantially exclusive of information in the second data vector. Similarly, the instructions cause the one or more processors to transform the second data vector into a second canonical vector, wherein the second canonical vector includes a first component, indicative of information in the first data vector and information in the second data vector, and a second component, indicative of information in the second data vector and substantially exclusive of information in the first data vector. The instructions also cause the one or more processors to generate a common information vector based on the first component of the first canonical vector and the first component of the second canonical vector. Also, the instructions cause the one or more processors to generate a first private vector based on the first canonical vector and the common information vector, and a second private vector based on the second canonical vector and the common information vector. The instructions also cause the one or more processors to compress the first private vector at a first private rate to generate a first digital message and to compress the second private vector at a second private rate to generate a second digital message. The instructions cause the one or more processors to compute an amount of common information contained in the common information vector, and, based on the amount of common information, to compute a third rate. The instructions also cause the one or more processors to compress the common information vector at the third rate to generate a third digital message. The instructions cause the one or more processors to route the first digital message via a first channel, the second digital message via a second channel, and the third digital message via a third channel.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a schematic representation of an exemplary communication network for communicating two correlated data vectors.

FIG. 2 is a flow chart that represents a computer-implemented method of compressively encoding two correlated data vectors.

DETAILED DESCRIPTION

FIG. 1 is a schematic representation of an exemplary communication network 100 for communicating two correlated data vectors indicative of, for example, an exemplary object 102. The object 102 may represent a scene or other observable phenomenon in a real or virtual environment. A sensor 106 and a sensor 108 may be configured to generate analog or digital signal representations of the object 102 to serve as inputs into the network 100. The sensors 106, 108 may be audio transducers, video cameras, chemical sensors, and/or any other suitable sensors capable of observing the real or virtual phenomenon. The sensors 106, 108 may be array sensors configured to simultaneously generate the data vectors including multiple values. Additionally or alternatively, the sensors 106, 108 may be configured to generate the data vectors as data sequences of observed values. The sensors 106 and 108 may be identical sensors, such as, for example, two of the same model of camera, or they may be different from each other, such as cameras with different resolutions or sensing different spectral regions. The sensors 106, 108 may also be referred to herein as “transducers,” “input devices,” “signal generators,” “signal sources,” or, simply, “sources.”

The sensors 106, 108 may be communicatively connected to a pre-encoder unit 110 comprising two corresponding pre-encoders 112, 114. The pre-encoder 112 may be in communicative connection with the sensor 106 to pre-encode data vectors generated by the sensor 106, and the pre-encoder 114 may be in communicative connection with the second sensor 108 to pre-encode data vectors generated by the sensor 108. While FIG. 1 illustrates the pre-encoders 112, 114 as separate components, in some implementations, a single pre-encoder is in communicative connection with both of the sensors 106, 108 and configured to pre-encode data vectors generated by both of the sensors 106, 108. Additional data processing elements (not shown) may intercede between the sensors 106, 108 and the pre-encoders 112, 114, for example, amplifiers, signal conditioners, analog-to-digital (A/D) converters, and/or other suitable signal processing elements that suitably transform signals from the sensors 106, 108 for improved processing by the pre-encoders 112, 114.

The pre-encoder unit 110 may be in communicative connection with an encoder unit 120 that comprises a common information encoder 122 (or, simply, common encoder 122) and two private encoders 124, 126. More particularly, the common encoder 122 may be coupled to receive the output of each of the pre-encoders 112, 114 and configured to determine a set of common information between the data vectors generated by the sensors 106, 108. As illustrated, an output of the common encoder 122 (X) is coupled to a common information compressor 123, and the outputs of the private encoders 124, 126 are coupled to respective compressors 125, 127. The output of the common encoder 122 may also be coupled to the two private encoders 124, 126. Each of the private encoders 124, 126 may be configured to combine the common information encoded by the common encoder 122 and the respective outputs of the pre-encoders 112, 114 (Y₁ and Y₂) to produce respective private signals Z₁ and Z₂. Accordingly, as illustrated, the common compressor 123 is configured to compress the common information X to produce a compressed signal W₀, the private compressor 125 is configured to compress the private signal Z₁ to produce a compressed signal W₁, and the private compressor 127 is configured to compress the private signal Z₂ to produce a compressed signal W₂.

Three channels 132, 134, 136 may form part of a communicative path between the encoder unit 120 and a decoder unit 140. For convenience, the channels 132, 134, 136 may be collectively referred to as the channels 130. The channels 130 may include the physical medium between the encoder unit 120 and the decoder unit 140, a variety of components (e.g., modulators, demodulators, filters, switches, amplifiers, antennas) configured to implement the logic of signal transmission between the encoder unit 120 and the decoder unit 140, as well as any units of the system that implement the logic of transmission. For example, the channels 130 may comprise a variety of cables such as twisted pair, coaxial, fiber-optic, or other suitable cables. Additionally or alternatively, the channels 130 may comprise substantially free-space media such as vacuum, air, water, ground, or other suitable media. In some embodiments, the channels 130 may, additionally or alternatively, include storage media that may hold the signals for a certain time duration. The channels 130 may be configured to support any known modulation schemes and protocols, including time-division multiplexing, frequency-division multiplexing, code-division multiplexing, and/or, more specifically, Ethernet, WiFi, Bluetooth, or any other suitable protocol. The channels may include modulation, demodulation, losses, amplification, dispersion, dispersion compensation, interference, cross-talk, and other signal transformations.

The encoder unit 120 may be configured to communicate with the decoder unit 140 via the channels 130. More particularly, the decoder unit 140 may include a common decompressor 143 configured to decompress the signal produced by the common compressor 123 as received over the channel 134 (also referred to as “the common channel” or “the public channel”) and private decompressors 141, 145 configured to decompress the signals respectively produced by the private compressors 125, 127, as received over the channels 132, 136 (also referred to as “the private channels”). It should be noted that the channels 130 may introduce noise or other distortions into the signals produced by the encoder unit 120. For example, the common information signal W₀ produced by the common compressor 123 may not match the noisy common information signal W̃₀ received at the decoder unit 140 after traversing the channel 134.

As illustrated, the decoder unit 140 includes decoders 142, 144 configured to reproduce the signals encoded by the encoder unit 120. Accordingly, the decoder 142 is configured to combine the outputs of the common decompressor 143 and the private decompressor 141 to produce a decoded signal Ŷ₁. Similarly, the decoder 144 is configured to combine the outputs of the common decompressor 143 and the private decompressor 145 to produce a decoded signal Ŷ₂.

As illustrated, the output of the decoder 142 is coupled to a first transducer 152 to produce a representation 162 of the object 102, as sensed by the sensor 106. Similarly, an output of the decoder 144 is coupled to a second transducer 154 to produce a representation 164 of the object 102, as sensed by the sensor 108. Additional units (not shown) may intercede between the decoders 142, 144 and the transducers 152, 154, including, for example, amplifiers, signal conditioners, digital-to-analog (D/A) converters, and/or other units that suitably transform signals so that the transducers 152, 154 may improve the respective representations 162, 164 of the object 102.

In operation, the sensor 106 generates a measurement of the object 102 to produce a data vector, Y₁, that is routed to the pre-encoder 112. Similarly, the sensor 108 generates a measurement of the object 102 to produce a data vector, Y₂, that is routed to the pre-encoder 114. The sensors 106, 108 may directly generate the data vectors when, for example, the sensors 106, 108 are digital sensor arrays. In other implementations, the signals generated by the sensors 106, 108 are transformed, for example, into a digital form by an analog-to-digital converter, prior to routing to the pre-encoders 112, 114. While FIG. 1 illustrates the sensors 106, 108 as being separate entities, in some implementations, a single sensor may generate the signals that are transformed into both data vectors Y₁ and Y₂.

The pre-encoder unit 110 may obtain the data vectors Y₁ and Y₂ from a computer memory location or from non-transitory computer storage media. While the sensors 106, 108 may generate the data by measuring real-world phenomena, the data may be partially or fully computer generated. For example, the data vectors may be two representations of a virtual object in a multiplayer gaming environment, and/or may comprise computer-generated stereo sound and/or video.

It should be appreciated that the arrangement of the sensors 106, 108 observing a common scene that includes the object 102 is only one arrangement in which the sensors 106, 108 generate correlated data. In general, and regardless of their origin, the first data vector Y₁ and the second data vector Y₂ may be of different sizes from each other (i.e., may have a different number of elements), may be generated in different ways and at different times from each other, and may differ in the type of information represented within the data. The methods described in the following discussion may apply to any two correlated data vectors.

The pre-encoder unit 110 may transform Y₁ into the first canonical vector, Y₁^(C), and transform Y₂ into the second canonical vector, Y₂^(C). The form of the canonical vectors may be referred to as “canonical variable form.” The transformation of the data vectors into the canonical variable form may enable the encoding unit 120 (and/or the common encoder 122) to act upon particular components of the resulting canonical vectors Y₁^(C) and Y₂^(C), such as components that are correlated to each other, and/or components that are substantially independent from each other. The canonical variable form may define a first component that includes a set of elements of the first or second canonical vector that is correlated (but not identical) to the other of the first or second canonical vectors. The canonical variable form may also define a second component that includes a set of elements of the first or second canonical vector that are substantially independent of the other of the first or second canonical vectors. That is, for the first canonical vector, the first component includes a set of elements that is correlated to the second canonical vector, and the second component includes a set of elements that is substantially independent of the second canonical vector. Similarly, for the second canonical vector, the first component includes a set of elements that is correlated to the first canonical vector, and the second component includes a set of elements that is substantially independent of the first canonical vector. Accordingly, the second component of the first canonical vector and the second component of the second canonical vector may be constructed in such a way that the second components are substantially independent from each other. The first components of the canonical vectors may be indicative of information in the first data vector and information in the second data vector. The second components of the canonical vectors may be indicative of information in the corresponding data vectors and substantially exclusive of information in the non-corresponding data vectors.

The correlation and/or substantial independence of components of the canonical vectors Y₁^(C) and Y₂^(C) may be formally described in terms of statistics or probability theory. The vectors Y₁^(C) and Y₂^(C) may be described in terms of their probability distributions or probability density functions (PDFs) that define a probability that a value of a vector lies in a region within the possible values. The PDFs may be estimated by observing many instances of the vectors or by computing theoretical distributions from the underlying mathematical and/or physical principles involved in the formation of the signals. The data vectors, or any transformations thereof, may thus be treated as so-called random variables. When knowing a value of a vector element does not substantially alter the PDF for a value of another vector element of the same or of a different vector, the two elements may be referred to as substantially independent. A threshold for substantial independence may be chosen as a percent change in PDF of less than 0.01, 0.02, 0.05, 0.1, 0.2, 0.5, 1, 2, 5, 10, or any other suitable threshold. The threshold may be defined in terms of absolute values of probability or relative changes.

When, on the other hand, knowing a value of a vector element alters the PDF for a value of another vector element of the same or of a different vector, the two elements may be referred to as correlated. In practice, the change in the PDF of one element due to knowledge of the value of another element may need to exceed a threshold before the elements may be considered correlated. The threshold value for correlation may be the same as the threshold for substantial independence, in which case changes in PDF greater than or equal to the threshold imply a correlation.

With respect to FIG. 1, the sensors 106, 108 may, for example, capture image data representative of the object 102 at two different angles, with some pixels of the sensor 106 representing the same portion of the object 102 as some pixels of the sensor 108. Given that values generated by sensor pixels generally exhibit a certain PDF that may correspond to, for example, surface brightness statistics for a class of imaged objects, pixel values from the sensor 106 may predict a range of pixel values from the sensor 108 more effectively than the original (unconditional) PDF. A conditional PDF for the pixel values from the sensor 108 may be estimated based on pixel values from the sensor 106. The conditional PDF, additionally or alternatively, may be computed from the estimation of the joint PDF for the two data vectors. For the pixels imaging the same area of the object 102, the conditional PDF may be significantly different from the original PDF, thereby implying that a correlation between the pixel values exists.
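
For illustration only, the following Python sketch estimates such a conditional PDF from a joint histogram of paired pixel observations. The bin count, the synthetic correlated data, and the helper name conditional_pdf are illustrative assumptions, not part of the disclosure.

```python
import numpy as np

def conditional_pdf(y1_samples, y2_samples, bins=32):
    """Estimate p(y2 | y1) on a grid from paired pixel observations."""
    joint, y1_edges, y2_edges = np.histogram2d(
        y1_samples, y2_samples, bins=bins, density=True)
    marginal_y1 = joint.sum(axis=1, keepdims=True)  # p(y1), up to bin width
    with np.errstate(divide="ignore", invalid="ignore"):
        cond = np.where(marginal_y1 > 0, joint / marginal_y1, 0.0)
    return cond, y1_edges, y2_edges

# Pixels are "correlated" in the sense of the text when rows of the
# conditional estimate differ appreciably from the unconditional marginal.
rng = np.random.default_rng(0)
y1 = rng.normal(size=10_000)
y2 = 0.8 * y1 + 0.6 * rng.normal(size=10_000)  # synthetic correlated pixels
cond, _, _ = conditional_pdf(y1, y2)
```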

In some implementations, covariance matrices (which can also be referred to as variance matrices) are used to indicate correlation and independence among vector elements. The covariance matrix of the two canonical vectors, referred to herein as the canonical covariance (or variance) matrix, Q_(cvf), may be used to determine the first component of the first canonical vector and the correlated first component of the second canonical vector. Additionally or alternatively, the covariance matrix of the two canonical vectors may indicate the second component of the first canonical vector and the independent second component of the second canonical vector. Furthermore, the transformation by the pre-encoder unit 110 may ensure that the first canonical vector elements are substantially independent from each other, and the second canonical vector elements are substantially independent from each other. Additionally or alternatively, the pre-encoder unit 110 may transform the first and second data vectors to ensure that every element of the first canonical vector has at most one correlated element in the second canonical vector. That is, there may be a one-to-one element correspondence between the first component of the first canonical vector and the first component of the second canonical vector.

As described in more detail below, the canonical covariance matrix may be computed from the covariance matrix of the data vectors Y₁ and Y₂ when the data vectors are modeled as instances of multivariate Gaussian random variables. Gaussian random variables may effectively model a wide range of practical signals. In some implementations, where the Gaussian random variable models do not model the data vectors sufficiently well, the pre-encoder unit 110 may apply a transformation to the data vectors to improve their conformity to the Gaussian representations. The covariance matrix of the data vectors modeled as instances of Gaussian random variables may be obtained by statistical observations of the data vectors, by computation from the physical and mathematical models of the data, and/or by other suitable means.

After obtaining the covariance matrix for the first data vector and the second data vector, the pre-encoder unit 110 may compute the canonical covariance matrix. Using the canonical covariance matrix, the pre-encoder unit 110 may then compute a first transformation matrix, S₁, for the first data vector and a second transformation matrix, S₂, for the second data vector. Subsequently, the pre-encoder unit 110 may compute the first canonical vector as Y₁^(C)=S₁Y₁ and the second canonical vector as Y₂^(C)=S₂Y₂.
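
One standard way to obtain such transformation matrices for jointly Gaussian vectors is canonical correlation analysis. The Cholesky/SVD construction sketched below is an assumption; the disclosure's own Algorithm 2 is described elsewhere and may differ.

```python
import numpy as np

def canonical_transforms(Q11, Q22, Q12):
    """Return S1, S2, d such that Cov(S1 Y1) = I, Cov(S2 Y2) = I,
    and Cov(S1 Y1, S2 Y2) = diag(d), the canonical correlations."""
    L1 = np.linalg.cholesky(Q11)  # Q11 = L1 L1^T
    L2 = np.linalg.cholesky(Q22)  # Q22 = L2 L2^T
    # K = L1^{-1} Q12 L2^{-T}, the coherence matrix
    K = np.linalg.solve(L1, np.linalg.solve(L2, Q12.T).T)
    U, d, Vt = np.linalg.svd(K)   # singular values d are the coefficients
    S1 = U.T @ np.linalg.inv(L1)
    S2 = Vt @ np.linalg.inv(L2)
    return S1, S2, d

# Usage: Y1c = S1 @ Y1 and Y2c = S2 @ Y2 then have identity covariances
# with cross-covariance diag(d), matching the block structure of Eq. A.
```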

The pre-encoder unit 110 may route the canonical vectors, the canonical covariance matrix, and/or other matrices as inputs to the encoder unit 120. In some implementations, the encoder unit 120 re-calculates the canonical covariance matrix by measuring the statistics of the canonical vectors. The encoder unit 120 may compute three vectors, X, Z₁, and Z₂, based on the components of the canonical vectors. Vectors X, Z₁, and Z₂ may be collectively referred to as “message vectors.” The common encoder 122 may compute a common information vector, X (which can also be referred to as “the common vector”), based on the first component of the first canonical vector and the first component of the second canonical vector. The first private encoder may compute a first private vector, Z₁ (which can also be referred to as “the first private information vector”), based on the first canonical vector and the common information vector, X. The second private encoder may compute a second private vector, Z₂ (which can also be referred to as “the second private information vector”), based on the second canonical vector and the common information vector, X.

In some embodiments, the encoder unit 120 may obtain a required transmission quality for the data vectors and/or canonical vectors. The required transmission quality may be, for example, a distortion level and a distortion function for the first data vector and a distortion level and a distortion function for the second data vector. The distortion levels may refer to upper limits on average distortions, computed using the distortion functions, of reconstructing the two data vectors, one for each data vector, when a suitably large number of data vectors are communicated through the network 100. The distortion level of the required transmission quality also may be referred to herein as “required distortion level,” “maximum distortion level,” or “maximum distortion.” The distortion functions may refer to Euclidean distances or other norms of differences between the original and the estimated data vectors, and/or the corresponding formulas. In some implementations, the distortion levels are specified for the canonical vectors. The values generated by applying distortion functions to data and/or canonical vectors and their reconstructions and/or estimates may be referred to, simply, as “distortions.” In some implementations, the encoder unit 120 generates three message bit streams, which include two private message bit streams and a common message bit stream, with the property that the two average distortions are below their corresponding distortion levels for the canonical vectors. Based on the distortion functions and the distortion levels, the encoder unit 120 may compute a rate region, using, for example, rate distortion functions, conditional rate distortion functions, and joint rate distortion functions for joint random variables, for each of the random variable models of the data vectors and/or each of the canonical vectors. Additionally or alternatively, the encoder unit 120 computes the rate region at least in part based on joint rate distortion functions for joint random variable models of the data vectors and/or canonical vectors, on the conditional rate distortion functions for the random variables modeling the canonical vectors conditioned on the common information vector, and/or on the amount of information contained in the common information vector. The amount of common information contained in the two data vectors and/or the amount of information in the common information vector may be computed, for example by the encoder unit 120, as Wyner's lossy common information for the two data vectors and/or the two canonical vectors.

To efficiently compress the three message vectors, the encoder unit 120 may compute three rates, R₀, R₁, R₂, that respectively represent a number of bits in digital messages encoding each of the corresponding message vectors, X, Z₁, and Z₂. For convenience, rate R₁ for the first private message vector may be referred to as the first private rate or, shorter, the first rate; rate R₂ for the second private message vector may be referred to as the second private rate or, shorter, the second rate; and rate R₀ for the common information vector may be referred to as the common rate or the third rate. The encoder unit 120 may select rates such that they lie in, or in sufficiently close proximity to, the Pangloss plane of the rate region of the Gray-Wyner network or of another suitable rate region, given by R₀+R₁+R₂=R(Δ₁, Δ₂), where R(Δ₁, Δ₂) is the joint rate distortion function of the two data vectors giving the minimal rate for reconstructing the two data vectors at the outputs of the decoder unit, while achieving average distortions no bigger than the distortion levels Δ₁, Δ₂. The encoder unit 120 may obtain the quality requirement including the distortion functions and the distortion levels that may represent maximum average distortions evaluated using the distortion functions. In some implementations, the distortion functions may include the first distortion function for the first data vector and/or for the first canonical vector and the second distortion function for the second data vector and/or the second canonical vector. The encoder unit 120 and decoder unit 140 may operate with average distortions not exceeding the distortion levels Δ₁, Δ₂ that may represent the corresponding transmission quality requirements for the data vectors Y₁, Y₂ or the canonical vectors Y₁^(C), Y₂^(C) that may be reproduced or estimated at the decoder unit 140. In some applications, the encoder unit 120 may compute the Δ₁, Δ₂ based on channel capacity. In any case, the Pangloss plane may represent the optimal rates. An optimal rate region may be defined by rates that are in sufficiently close proximity to the Pangloss plane, where the sum of the rates, R₀+R₁+R₂, is within a certain percentage (0.1, 0.2, 0.5, 1, 2, 5, or any other suitable number) of the optimal value given by the rate distortion function R(Δ₁, Δ₂).

In some implementations, the encoder unit 120 may compute a lower bound for the sum of the first private rate, the second private rate, and the third rate (the common rate) by evaluating one or more rate distortion functions using water filling techniques based on the two canonical vectors, the two data vectors, and/or covariance or correlation matrices of the canonical vectors and/or data vectors. For example, the use of water filling techniques may be based on the canonical correlation coefficients of the first canonical vector and the second canonical vector, derived from the canonical covariance matrix. The one or more rate distortion functions may include a joint rate distortion function for the first distortion function and the first distortion level in conjunction with the second distortion function and the second distortion level. The first distortion function may be the same as the second distortion function, and may be referred to, simply, as the “distortion function.” The encoder unit 120 may compute the first private rate, the second private rate, or the third rate at least in part based on the lower bound for the sum of the rates.
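
The classical reverse water-filling evaluation of a Gaussian rate distortion function under square error distortion may be sketched as follows. The per-component variances passed in (for example, values derived from the canonical correlation coefficients) are an assumption, since the disclosure's Eqs. 108-111 are not reproduced in this section.

```python
import numpy as np

def gaussian_rdf(variances, delta):
    """R(delta) in bits for independent Gaussian components, MSE distortion,
    via reverse water filling: D_i = min(theta, var_i), sum(D_i) = delta."""
    variances = np.asarray(variances, dtype=float)
    lo, hi = 0.0, variances.max()
    for _ in range(100):  # bisect on the water level theta
        theta = 0.5 * (lo + hi)
        if np.minimum(theta, variances).sum() < delta:
            lo = theta
        else:
            hi = theta
    d_i = np.minimum(theta, variances)  # per-component distortions
    return 0.5 * np.sum(np.log2(variances / np.maximum(d_i, 1e-15)))

# Example: a lower bound for the sum rate R0 + R1 + R2 at total distortion 0.4.
print(gaussian_rdf([1.0, 0.6, 0.3, 0.1], delta=0.4))
```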

Based on the computed rates, the compressors 123, 125, 127 may generate three digital messages, W₀, W₁, W₂, by compressing, respectively, the common information vector X with rate R₀, the first private vector Z₁ with rate R₁, and the second private vector Z₂ with rate R₂. For convenience, the digital messages may be referred to as the first digital message W₁, the second digital message W₂, and the third digital message W₀. The digital messages may be generated from sequences of message vectors. For example, the common compressor 123 may generate the common (third) digital message W₀ by compressing a sequence of N independent common information vectors X₁, X₂, . . . , X_(N−1), X_(N) to NR₀ bits. The first private compressor 125 may generate the first digital message W₁ by compressing N independent first private information vectors Z_(1,1), Z_(1,2), . . . , Z_(1,N−1), Z_(1,N) to NR₁ bits. The second private compressor 127 may generate the second digital message W₂ by compressing N independent second private information vectors Z_(2,1), Z_(2,2), . . . , Z_(2,N−1), Z_(2,N) to NR₂ bits.

After computing the rates, the compressors 123, 125, 127 may use lattice coding, transform coding, entropy-coded dithered quantization (ECDQ), and other known coding techniques to generate the three digital messages.
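
As a minimal illustration of the ECDQ idea, the sketch below applies subtractive-dither uniform scalar quantization. The step size is arbitrary, and the actual entropy coder that would map the integer indices to a bit stream near the target rate is omitted; both simplifications are assumptions, not part of the disclosure.

```python
import numpy as np

rng = np.random.default_rng(1)

def ecdq_encode(z, step):
    """Quantize z with subtractive dither; indices go to an entropy coder."""
    dither = rng.uniform(-step / 2, step / 2, size=z.shape)
    indices = np.round((z + dither) / step).astype(int)
    return indices, dither

def ecdq_decode(indices, dither, step):
    """Reconstruct with the (shared) dither; the error is uniform with
    variance step**2 / 12, independent of the input."""
    return indices * step - dither

z = rng.normal(size=8)
idx, u = ecdq_encode(z, step=0.5)
z_hat = ecdq_decode(idx, u, step=0.5)
```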

The compressors 123, 125, 127 of the encoder unit 120 may route the generated digital messages via the channels 130 to the decoder unit 140. Routing via the channels 130 may include converting the digital messages into analog voltage waveforms using modulation and transmitting the analog waveforms as radio waves, either wirelessly (in free space) or through cables. Routing may include storing the first digital message in a first memory location, storing the second digital message in a second memory location, and storing the third digital message in a third memory location. Prior to decoding, the decoder unit 140 may receive the waveforms routed by the encoder unit 120 and demodulate the waveforms into digital messages. Additionally or alternatively, the decoder unit 140 may retrieve the digital messages from the memory locations before decoding.

Generally, the digital messages at the output of the channels 130 may contain distortions caused by the channel and can be referred to as noisy digital messages. In some implementations, lossy compression causes these distortions. The decoder unit 140 may obtain the first noisy digital message W̃₁, the second noisy digital message W̃₂, and the third noisy digital message W̃₀ via the channels 130. The channels 130 may distort the messages with non-linear effects, spectral filtering, interference, and additive noise. In some applications, a receiver compensates for the preceding channel effects with equalization and other distortion compensation, adding the compensation to the overall channel effect. Furthermore, the channels 130 may implement channel encoding and channel decoding units to mitigate the effects of channel distortions and noise.

The channels 130, including any equalization and distortion compensation units, may result in the dominant effect being additive noise. The noise may be additive white Gaussian noise (AWGN). The techniques and algorithms described herein may have particular mathematical significance, such as near-optimal performance, for AWGN channels. Nevertheless, the techniques and algorithms described herein apply to other noise distributions, including, for example, Poisson, Gamma, and/or Ricean distributions.

In some applications, the effects of channel noise may be negligible, and one or more of the noisy messages may equal their original versions at the outputs of the compressors 123, 125, 127 (e.g., W̃₀=W₀, W̃₁=W₁, and/or W̃₂=W₂). The amount of additive noise in a channel may affect its capacity. In some applications, the encoder unit 120 selects the rates to optimize the use of the channel capacities. In other applications, the encoder unit 120 may assign capacities to the channels 132, 134, and 136 based on the computed rates.

At the decoder unit, three decompressors 141, 143, and 145 may apply suitable decoding schemes to reverse the compression applied by the corresponding compressors 125, 123, and 127. The common decompressor 143 may generate an estimated common information vector X̂ based on the third noisy digital message W̃₀. The private decompressor 141 may generate an estimated first private vector Z̃₁ based on the first noisy digital message W̃₁. The private decompressor 145 may generate an estimated second private vector Z̃₂ based on the second noisy digital message W̃₂. In applications where the third digital message W̃₀=W₀ at the output of the common channel 134 is a noiseless reproduction, the estimated common information vector X̂=X may be equal to the output of the common encoder 122. In applications where the digital messages W̃₁=W₁ and W̃₂=W₂ at the outputs of the private channels 132, 136 are noiseless reproductions of the inputs to the channels, the estimated first private vector Z̃₁ and the estimated second private vector Z̃₂ may nevertheless contain distortions. Lossy compression by the private compressors 125 and 127 may result in distortions of the private vectors Z₁ and Z₂ even when the private channels 132 and 136 are noiseless. The distortions may satisfy the transmission quality requirements. In some applications, the private encoders 124, 126, the private compressors 125, 127, and/or the common encoder 122 compute rates that satisfy the transmission quality requirement in view of the channel capacity of multiple access channels, broadcast channels, interference channels, or other suitable channels.

The decoder unit 140 may route the outputs of the decompressors 141, 143, and 145 to the decoder 142 and the decoder 144. The decoder 142 may combine the estimate of the common information vector X̂ and the estimate of the first private vector Z̃₁ to compute an estimate of the first data vector Y₁. The decoder 144 may combine the estimate of the common information vector X̂ and the estimate of the second private vector Z̃₂ to compute an estimate of the second data vector Y₂. The decoders 142, 144 may compute the estimates of the data vectors Ŷ₁ and Ŷ₂ in two steps. In the first step, the decoders 142, 144 may compute estimates of the first canonical vector Ŷ₁^(C) and the second canonical vector Ŷ₂^(C). In the second step, the two decoders 142, 144 may transform the estimates of the canonical vectors Ŷ₁^(C) and Ŷ₂^(C) into the estimates of the data vectors Ŷ₁ and Ŷ₂ by applying inverses of the transformations for generating the canonical vectors from the data vectors. In some applications, the decoders 142, 144 compute and apply inverse transformations that are matrix inverses of S₁ and S₂. In other applications, the decoders 142, 144 compute and apply inverse transformations that are regularized forms of the inverses, thereby improving the conditioning of the transformations.
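
The two inverse-transformation options just described may be sketched as follows; the Tikhonov-style regularization shown is one common choice, and its weight is an illustrative assumption rather than a value from the disclosure.

```python
import numpy as np

def inverse_transform(S, y_canonical_est, reg=0.0):
    """Map an estimated canonical vector back to the data-vector domain."""
    if reg == 0.0:
        return np.linalg.solve(S, y_canonical_est)  # plain matrix inverse
    # Regularized inverse: minimize ||S y - y_c||^2 + reg * ||y||^2,
    # which improves conditioning when S is nearly singular.
    A = S.T @ S + reg * np.eye(S.shape[1])
    return np.linalg.solve(A, S.T @ y_canonical_est)
```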

The first transducer 152 and the second transducer 154 may render the estimated data vectors Ŷ₁ and Ŷ₂ to produce, correspondingly, the first representation 162 and the second representation 164 of the object 102. In some implementations, the first transducer 152 and the second transducer 154 are of the same type, e.g., displays, sections of a display, speakers, etc., and render similar types of information. In other implementations, the transducers 152, 154 are of different types and/or render different types of information. In some implementations, the two representations 162, 164 serve as complementary information for a user, while in other applications the first representation 162 is rendered for a first user and the second representation 164 is rendered for a second user. The rendering of the first representation 162 may be secret from the second user, and/or the rendering of the second representation 164 may be secret from the first user.

FIG. 2 is a flow chart that represents a computer-implemented method 200 of compressively encoding two correlated data vectors. The method applies, for example, to the encoding portion of the communication network 100 illustrated in FIG. 1. The method 200 may be implemented using computer hardware including one or more processors. The one or more processors may include central processing units (CPUs), graphical processing units (GPUs), digital signal processors (DSPs), and/or other suitable processors. Additionally or alternatively, hardware devices such as application-specific integrated circuits (ASICs) or field-programmable gate arrays (FPGAs) may implement at least portions of the method 200. The hardware and software configured to implement the method is referred to herein as a computing system. Instructions to implement the method 200 may be executed by the one or more processors of the computing system.

A non-transitory computer-readable medium may store instructions for the method 200 to compressively encode two correlated data vectors. The computer-readable medium may be a hard drive, a flash drive, an optical storage medium, and/or any other suitable medium. One or more processors of the computing system may execute the instructions to perform the actions in the blocks 210-280 of the method 200.

At block 210, the computing system may obtain a first data vector and a second data vector. For example, in the communication network 100, the first and second data vectors may be generated by the sensors 106, 108 and communicated to the pre-encoder unit 110 implemented by the computing system. Without any loss of generality, the first data vector is referred to herein as Y₁ and the second data vector is referred to herein as Y₂. The computing system may analyze the first data vector and the second data vector at block 210 to determine whether to proceed with and/or adjust the subsequent steps of the method 200. For example, the computing system may determine whether the first data vector and the second data vector are substantially representative of correlated Gaussian random variables. To make the determination, the computing system may analyze statistical distributions and/or covariance of vector elements from multiple pairs of data vectors, including the first data vector and the second data vector. The computing system may determine that a statistical distribution is substantially representative of a Gaussian random variable based on a quality of fit of the distribution to a Gaussian function. The computing system, additionally or alternatively, may compute a probability that a set of data vectors including the first data vector and the second data vector was generated by a joint Gaussian distribution within the neighborhood of the set of data vectors. The computing system may subsequently determine that the first data vector and the second data vector are substantially representative of correlated Gaussian random variables at least in part by comparing the computed probability to a threshold.
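
One possible realization of such a Gaussianity check is sketched below; the choice of the D'Agostino-Pearson test on each marginal and the significance level are assumptions, since the disclosure does not prescribe a specific statistical test.

```python
import numpy as np
from scipy import stats

def substantially_gaussian(samples, alpha=0.01):
    """samples: (n_observations, n_elements) array of stacked data vectors.
    Returns True if no element's marginal distribution is rejected."""
    _, p_values = stats.normaltest(samples, axis=0)  # D'Agostino-Pearson
    return bool(np.all(p_values > alpha))

rng = np.random.default_rng(2)
pairs = rng.multivariate_normal([0, 0], [[1, 0.7], [0.7, 1]], size=5000)
print(substantially_gaussian(pairs))  # expected: True for correlated Gaussians
```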

At block 220, the computing system may transform the first data vector into a first canonical vector and the second data vector into a second canonical vector. Without any loss of generality, the first canonical vector is referred to as Y₁^(C) and the second canonical vector is referred to as Y₂^(C). In the example network 100 of FIG. 1, the canonical vectors are generated by the pre-encoder unit 110. In other implementations, different logical units generate the canonical vectors. For example, in some implementations, an encoder unit, such as the encoder unit 120, transforms the data vectors to the canonical vectors.

The first canonical vector may contain the same information as the first data vector, but in canonical variable form. Likewise, the second canonical vector may contain the same information as the second data vector, but in canonical variable form. In some implementations, the canonical variable form is a form that ensures that the covariance matrix of the two canonical vectors substantially conforms to the canonical variable form for a covariance matrix, as expressed by the equation below:

$Q_{cvf} = \mathrm{Cov}\begin{pmatrix} Y_{1}^{C} \\ Y_{2}^{C} \end{pmatrix} = \begin{pmatrix} I_{p_{11}} & 0 & 0 & I_{p_{21}} & 0 & 0 \\ 0 & I_{p_{12}} & 0 & 0 & D & 0 \\ 0 & 0 & I_{p_{13}} & 0 & 0 & 0 \\ I_{p_{21}} & 0 & 0 & I_{p_{21}} & 0 & 0 \\ 0 & D & 0 & 0 & I_{p_{22}} & 0 \\ 0 & 0 & 0 & 0 & 0 & I_{p_{23}} \end{pmatrix}, \qquad (\text{Eq. A})$

where I_(p11), I_(p12), I_(p13), I_(p21), I_(p22), I_(p23) are identity matrices of respective dimensions p₁₁, p₁₂, p₁₃, p₂₁, p₂₂, p₂₃ (where p₁₁=p₂₁ and p₁₂=p₂₂) and D is a diagonal matrix of dimension p₁₂. The dimension p₁₁+p₁₂+p₁₃=p₁ is the size of the first canonical vector Y₁^(C), while the dimension p₂₁+p₂₂+p₂₃=p₂ is the size of the second canonical vector Y₂^(C). The diagonal matrix D may contain the canonical correlation coefficients, or canonical singular values, of the first and second canonical vectors.

In the following discussion, substantially conforming to the canonical variable form of the covariance matrix may refer to having a covariance matrix where the elements that correspond to the zero elements of Q_(cvf) in Eq. A are within a predetermined threshold of zero, while the elements that correspond to the unity elements of Q_(cvf) in Eq. A are within a predetermined threshold of unity. For example, covariance matrix elements below 0.05, 0.02, 0.01, 0.005, 0.002, or any other suitable number may be considered substantially zero. Analogously, covariance matrix elements above 0.95, 0.98, 0.99, 0.995, 0.998, or any other suitable number may be considered substantially equal to unity. The transformations at block 220 may ensure that there are only non-negative values in the covariance matrix.
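
A direct check of this notion of substantial conformity may be sketched as follows, with the should-be-zero and should-be-one entries read off from the exact pattern of Eq. A; the tolerance values are the example thresholds given above.

```python
import numpy as np

def conforms_to_cvf(Q, target, zero_tol=0.02, one_tol=0.02):
    """Q: empirical covariance of the stacked canonical vectors.
    target: matrix of exact Eq. A values (0s, 1s, and the entries of D);
    entries of D are neither 0 nor 1 and are skipped by both masks."""
    zero_mask = target == 0.0
    one_mask = target == 1.0
    ok_zeros = np.all(np.abs(Q[zero_mask]) < zero_tol)
    ok_ones = np.all(np.abs(Q[one_mask] - 1.0) < one_tol)
    return bool(ok_zeros and ok_ones)
```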

According to Eq. A, to substantially conform to the canonical variable form, all of the elements of the first canonical vector may be required to be substantially independent from each other and have substantially unity variance. Likewise, substantially conforming to the canonical variable form may require that all of the elements of the second canonical vector are substantially independent from each other and have substantially unity variance. According to Eq. A, the first and second canonical vectors may each contain p₁₁=p₂₁ elements that form identical pairs with each other. Thus, the first canonical vector may contain a component that is identical or substantially identical to a component of the second canonical vector. In some scenarios, p₂₁=p₁₁=0, and there are no identical or substantially identical components in the first and second canonical vectors.

The elements of the diagonal sub-matrix D in Eq. A, with values 1>d₁≥d₂≥ . . . ≥d_(p₁₂)>0, the canonical covariance or correlation coefficients, indicate that there are p₁₂=p₂₂ elements in each of the two canonical vectors that are pairwise correlated with each other. The corresponding elements may form a first component of the first canonical vector and a first component of the second canonical vector, wherein each of the two correlated components is indicative of the same and/or common information. Because the first canonical vector contains information from the first data vector and the second canonical vector contains information from the second data vector, the first component of the first canonical vector may contain information indicative of the first data vector and information indicative of the second data vector. Analogously, because the first components of the canonical vectors are correlated, the first component of the second canonical vector may contain information indicative of both the first data vector and the second data vector.

Furthermore, according to Eq. A, the first canonical vector contains p₁₃ elements that are substantially independent from all of the elements in the second canonical vector. These elements form a second component of the first canonical vector. Analogously, the second canonical vector contains p₂₃ elements that are substantially independent from all of the elements in the first canonical vector. These elements form a second component of the second canonical vector. Therefore, the second component of the first canonical vector may be indicative of information in the first data vector and substantially exclusive of information in the second data vector. Analogously, the second component of the second canonical vector may be indicative of information in the second data vector and substantially exclusive of information in the first data vector.

At block 220, the computing system may transform the data vectors using matrix multiplication: Y₁^(C)=S₁Y₁ and Y₂^(C)=S₂Y₂. In one implementation, the pre-encoder unit 110 computes the nonsingular transformation matrix for the first data vector, S₁, and the nonsingular transformation matrix for the second data vector, S₂, based on a covariance matrix of the data vectors according to Algorithm 2 described below. Other logical units, such as, for example, an encoder, configured within the computing system that implements the method 200 may also implement Algorithm 2 and compute the transformation matrices. To compute the transformation matrices, the computing system may estimate the covariance of the data vectors by analyzing multiple observations of the data vectors. In other implementations, the computing system computes the covariance matrix for the data vectors based on the physical and/or mathematical principles underlying the generation of the data vectors. The computing system may obtain the covariance matrix for the first data vector and the second data vector in other ways, which may include, for example, altering, adjusting, or updating a pre-computed covariance matrix based on changes in data acquisition geometry, analysis of observed data vectors, and/or other information. For example, at block 220, the computing system may, for every new pair of data vectors, analyze a difference between the first data vector and the second data vector to modify the covariance matrix. The computing system may perform the analysis on multiple pairs of data vectors, and make adjustments to the covariance matrix in view of confidence and/or uncertainty in the values of the covariance matrix prior to the adjustment.
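
One way to keep the covariance matrix current as new pairs of data vectors arrive is an exponentially forgetting rank-1 update, sketched below; the forgetting factor is an illustrative tuning parameter, not a value from the disclosure.

```python
import numpy as np

def update_covariance(Q, mean, y1, y2, forget=0.99):
    """Rank-1 update of the joint covariance of the stacked vector (y1; y2).
    Older observations are discounted geometrically by `forget`."""
    y = np.concatenate([y1, y2])
    mean = forget * mean + (1.0 - forget) * y
    centered = y - mean
    Q = forget * Q + (1.0 - forget) * np.outer(centered, centered)
    return Q, mean
```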

As discussed above, each of the canonical vectors may contain three components, with the covariance matrix Q_(cvf) of Eq. A, with the three components of each canonical vector designated by

$Y_{1}^{C} = \begin{pmatrix} Y_{11} \\ Y_{12} \\ Y_{13} \end{pmatrix}, \quad Y_{2}^{C} = \begin{pmatrix} Y_{21} \\ Y_{22} \\ Y_{23} \end{pmatrix}, \qquad (\text{Eq. B})$

where Y₁₂, Y₂₂ may be the correlated first components with the cross-covariance matrix given by D, which may contain the canonical covariance or correlation coefficients; Y₁₃, Y₂₃ may be the substantially independent second components with a substantially zero cross-covariance matrix; and Y₁₁, Y₂₁ may be substantially identical components with a cross-covariance matrix that may be substantially an identity matrix. Herein, Y₁₁ may be referred to as a third component of the first canonical vector, and Y₂₁ may be referred to as a third component of the second canonical vector.

At block 230, the computing system may generate a common information vector, a first private vector, and a second private vector. More particularly, the computing system may generate the first private vector, the second private vector, and the common information vector in response to determining that the first data vector and the second data vector are substantially representative of correlated Gaussian random variables. To generate the common information vector X based on the first component of the first canonical vector and the first component of the second canonical vector, the computing system may use Eq. 178 given below. The common information vector generated by the computing system may include the third component of the first canonical vector or the third component of the second canonical vector, as specified, for example, in Eqs. 176 and 177 below. In some scenarios, the computing system may average the substantially identical third components and include the average in the common information vector. In the exemplary communication network of FIG. 1, the common encoder 122 may compute the common information vector using the canonical vector components and equations described above. To generate the two canonical vectors, the computing system may use Eqs. 183-185 below or, for example, Eqs. 98-107.

The computing system may compute the first private information vector based on the first canonical vector and the common information vector at block 230 using Eqs. 179, 181, and 182 below. Using the equations, the computing system may compute one component of the first private information vector by subtracting a linear transformation of at least a portion of the common information vector from the first component (Y₁₂) of the first canonical vector. The computing system may compute another component of the first private information vector by including the elements of the second component of the first canonical vector. Thus, the first component of the first canonical vector, with the common information removed, may be concatenated with the second component of the first canonical vector to form the first private information vector, Z₁, of size p₁₂+p₁₃. Similarly, at block 230, the computing system may also compute the second private information vector using, for example, Eqs. 180-182. The first component of the second canonical vector, with the common information removed, may be concatenated with the second component of the second canonical vector to form the second private information vector, Z₂, of size p₂₂+p₂₃. The first private encoder 124 and the second private encoder 126 of the network 100 may generate, respectively, Z₁ and Z₂, by implementing, for example, the operations described in Eqs. 179-182. For example, the computing system may compute the first and second private information vectors using Eqs. 103, 104.
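
A schematic construction of the three message vectors from the canonical components is sketched below. Because Eqs. 176-182 are not reproduced in this section, the averaging of the identical components and the LMMSE-style coefficients used to strip the common information (derived under the assumed model Y₁ᵢ = √dᵢ·Xᵢ + noise for each correlated pair) are assumptions, not the disclosure's own equations.

```python
import numpy as np

def build_messages(y11, y12, y13, y21, y22, y23, d):
    """Form X, Z1, Z2 from canonical components; d: canonical correlations."""
    x_identical = 0.5 * (y11 + y21)                  # average identical parts
    x_corr = np.sqrt(d) / (1.0 + d) * (y12 + y22)    # LMMSE estimate of X_i
    x = np.concatenate([x_identical, x_corr])
    # Private vectors: correlated component with common part removed,
    # concatenated with the private (independent) component.
    z1 = np.concatenate([y12 - np.sqrt(d) * x_corr, y13])  # size p12 + p13
    z2 = np.concatenate([y22 - np.sqrt(d) * x_corr, y23])  # size p22 + p23
    return x, z1, z2
```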

At block 240, the computing system may compress the first private vector Z₁ at a first rate to generate a first digital message and the second private vector Z₂ at a second private rate to generate a second digital message. The first rate, R₁, may indicate the number of bits used to represent the first private vector, while the second private rate, R₂, may indicate the number of bits used to represent the second private vector. In some implementations, the computing system may simultaneously compress information from a sequence of N pairs of data vectors, resulting in N pairs of private information vectors, including the first and the second private information vectors. For example, the first private compressor 125 may simultaneously compress N private information vectors including the first private information vector to generate the first digital message of NR₁ bits. Analogously, the second private compressor 127 may simultaneously compress N private information vectors including the second private information vector to generate the second digital message of NR₂ bits.

To compute the first rate R₁ and the second private rate R₂, the computing system may obtain a transmission quality requirement and compute a rate region based at least in part on the obtained transmission quality requirement. The transmission quality requirement may include a first distortion level and a first distortion function, and a second distortion level and a second distortion function. The first distortion level may correspond to the highest level of acceptable expected or average distortion in the decoded first data vector and/or in the first canonical vector. For a given data vector, the distortion may exceed the maximum distortion. That said, the average distortion for a large collection of data vectors may still be limited to the maximum distortion. The computing system may compute distortion as a square error distortion, given by Eqs. 114, 115, that squares the element-wise differences between the vector before encoding and the vector after decoding. The computing system may compute the rate region by evaluating the rate distortion functions in Eqs. 108-111 to determine the minimum private rates and/or the sum rate in Eq. 116 that may be needed to satisfy the required transmission quality. In some applications, the computing system may evaluate the relationship between the rates and the distortion by changing rates, measuring distortion, and storing an observed rate-distortion function for reference, as in the sketch below. In some implementations, a first rate distortion function and a second rate distortion function may correspond to a rate distortion function for the first distortion function and a rate distortion function for the second distortion function. The first distortion function may be the same as the second distortion function.
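
The empirical alternative mentioned above, sweeping candidate rates and recording the measured average square-error distortions, may be sketched as follows. Here compress_at_rate is a hypothetical stand-in for the compressor/decompressor chain of FIG. 1, not a function named in the disclosure.

```python
import numpy as np

def observed_rd_curve(vectors, rates, compress_at_rate):
    """Record (rate, average square-error distortion) pairs for reference."""
    curve = []
    for r in rates:
        reconstructions = [compress_at_rate(v, r) for v in vectors]
        distortion = np.mean([np.sum((v - v_hat) ** 2)
                              for v, v_hat in zip(vectors, reconstructions)])
        curve.append((r, distortion))
    return curve
```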

The computed rate region may be a Gray-Wyner lossy rate region for a Gray-Wyner network such as, for example, the communication network 100 of FIG. 1. The Gray-Wyner lossy rate region may include the collection of all rate triples (R₀, R₁, R₂), respectively, of the third rate, the first rate, and the second private rate, at which the corresponding independent vectors (X, Z₁, Z₂) may be transmitted over a Gray-Wyner network with noiseless channels and decoded into two estimates of the data vectors, satisfying the transmission quality requirement. The first rate may be referred to as the first private rate, the second rate may be referred to as the second private rate, and the third rate may be referred to as the common rate or the common information rate. The sum of the rates, R₀+R₁+R₂, equal to the joint rate distortion function of the first maximum distortion and the second maximum distortion, may define a Pangloss plane for the Gray-Wyner lossy rate region, according to Eq. 94. The computing system may compute the rates on the Pangloss plane of the Gray-Wyner lossy rate region using Eq. 95.

The first rate on the Pangloss plane may correspond to a lower bound for the first rate in the Gray-Wyner lossy rate region according to Eqs. 43-48 or Eqs. 49 and 50. The computing system may compute the lower bound for the first private rate by evaluating the rate distortion function for the first distortion that is no greater than the first distortion level, using Eqs. 108-109 and, for example, the classical water filling algorithm based on the canonical correlation coefficients. The second rate on the Pangloss plane may correspond to a lower bound for the second private rate in the Gray-Wyner lossy rate region. The computing system may compute the lower bound for the second private rate by evaluating the rate distortion function for the second distortion that is no greater than the second distortion level, using Eqs. 110-111 and, for example, the classical water filling algorithm based on the canonical correlation coefficients.

At block 250, the computing system may compute the amount of common information in the common information vector by computing, for example, an amount of Wyner's common information using Eq. 88, and based on Eq. 50. In some applications, the computing system removes the substantially identical components of the canonical vectors from the common information vector to ensure a finite value of Wyner's common information. The amount of Wyner's lossy common information based on Eq. 49 and Eq. 50 in the common information vector may correspond to the third rate on the Pangloss plane. At block 260, the computing system, using, for example, the common encoder 122, may assign the computed amount of Wyner's lossy common information to the third rate R₀, to place the third rate in the Pangloss plane. Thus, the computing system may prioritize the third rate and reduce the third rate as much as possible while still satisfying transmission requirements. The substantially prioritized minimum third rate may offer the advantage of efficiency because the decoder unit 140 may, in some implementations, be replaced with two decoder units, one for each data vector, each one decompressing the common digital message at the third rate. In some implementations, the computing system may compute the amount of common information by estimating the entropy of the collection of vectors including the common information vector, generated, for example, by the common encoder 122. At block 260, the computing system may compute and/or update the third rate R₀ based on the amount of common information computed at block 250. The computing system may set the third rate to the amount of the common information, or may choose the rate to be 5%, 10%, 20%, or any other suitable margin above the amount of the common information. In some implementations, the computing system may compute an acceptable maximum distortion in the common information vector and compute the third rate based on the acceptable maximum distortion.
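A minimal sketch of this rate-assignment step, assuming the amount of common information has already been computed in bits; the function name and the margin handling are illustrative.

```python
def third_rate(common_information_bits, margin=0.10):
    # Set R0 to the computed common information, optionally inflated by a
    # margin (e.g., 0.05, 0.10, or 0.20) above the amount of common information.
    return common_information_bits * (1.0 + margin)
```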

In some implementations, the computing system may implement one or more test channels that may be realized or computationally implemented with additive Gaussian noise, for example. The computing system may use the test channels to implement encoders for compressing the first private vector, the second private vector, and the common vector at the first private rate, the second private rate, and the third rate. For example, the system may evaluate distortion in the first and second data vectors or in the first and second canonical vectors reconstructed from information transmitted through at least one of the one or more test channels. Additionally or alternatively, the test channels may be used in conjunction with water filling solutions for implementing encoders. The test channels may be used to model the rate distortion function of the first private rate, the rate distortion function of the second private rate, and/or the joint rate distortion function of the sum of the first rate, the second rate, and the common rate.

In some implementations, the computing system may obtain a channel capacity for the first channel, the second channel, and/or the third channel. The channel capacity may be limited by available frequency bandwidth, speed of available hardware, noise, power limitations, and/or other constraints. Furthermore, the channel capacity may vary dynamically. The computing system may compute the first rate, the second rate, and/or the third rate based at least in part on the obtained channel capacity. For example, when the channel capacity of the first and/or second channel decreases, the system may compute the rates that maintain the distortion at the lowest expected level for the new channel capacities. The system may additionally or alternatively assign available channels to be the first channel, the second channel, and/or the third channel based on the obtained channel capacities.

At block 270, the computing system may compress the common information vector at the third rate R₀ to generate a third digital message. In some implementations, the computing system may simultaneously compress information from a sequence of N pairs of distinct data vectors, resulting in N distinct common information vectors, including the common information vector. For example, the common compressor 123 may simultaneously compress N common information vectors to generate the third digital message of NR₀ bits.

At block 280, the computing system may route the first digital message via a first channel, the second digital message via a second channel, and the third digital message via a third channel. The first channel and the second channel may be private channels and the third channel may be a public channel. The public channel may transmit information that is available and/or accessible to multiple receivers and/or decoders. In other implementations, the information in the public channel is unavailable and/or inaccessible to receivers and/or decoders other than the receivers and/or decoders in the network. The network may keep information from certain receivers and/or decoders by physically isolating signals in wired channels and/or by encrypting information and controlling the distribution of encryption keys. In some implementations, the computing system encrypts information in the digital messages for routing via the private channels and via the public channel at block 280. In such applications, multiple receivers and/or decoders may obtain the encryption key for the third digital message, while the encryption keys for the first and second digital messages may only be available to the corresponding receivers and/or decoders.

In some implementations, the computing system routes the digital messages at block 280 by storing the digital messages in one or more storage media. The computing system may route the first digital message by storing the first digital message at a first memory location of the one or more storage media. The system may route the second digital message by storing the second digital message at a second memory location of the one or more storage media. The system may route the third digital message by storing the third digital message at a third memory location of the one or more storage media. The storage media may include magnetic hard drives, magnetic tapes, optical storage media, flash drives, and/or any other suitable storage media.

The first memory location and the second memory location may be secure memory locations of the one or more storage media. In some implementations, the secure memory locations are physically isolated from memory reading devices other than the memory reading devices communicatively connected to specified receivers and/or decoders for reconstructing corresponding data. In other implementations, the secure memory locations comprise encrypted memory for storing the private information. The third memory location may be an unsecured memory location that is physically accessible to multiple receivers, decoders, and/or other units for reconstructing the data vectors. The unsecured memory location may contain unencrypted information or may contain encrypted information with the encryption key available to multiple receivers and/or decoders.

The following provides examples and mathematical derivations pertaining to the foregoing discussion.

The following discussion focuses on the mathematical principles applied to source or data compression techniques. Source encoding refers to the conversion of data sequences, vectors, and/or sequences of vectors generated by sources, referred to as “data,” into binary sequences, bit streams, and/or digital messages, referred to as “messages,” that are more efficiently represented than the data and more suitable for transmission over noiseless or noisy channel(s) to the decoder(s) that reproduce(s) the original data, either exactly or approximately subject, for example, to a distortion criterion. Such techniques may be used in transmission of data from an optical disk to a music player, in computer applications, such as storing data, machine-to-machine interaction, etc., and in general point-to-point and network applications. In communication applications it may be desirable to transmit compressed streams of data which can be reconstructed at the decoder (at least approximately) to their original form, subject to the pre-specified distortion or fidelity criterion. In computer applications, data compression methods described herein may reduce storage requirements compared to uncompressed data.

Data compression techniques may be divided into lossy and lossless compression.

Lossless data compression techniques may be used when the bit streams or digital messages generated by an encoder are used by a decoder to reproduce the original data with arbitrarily small average probability of error, asymptotically. Lossless data compression techniques ensure that the role of the encoder and decoder system is to remove redundancy, i.e., achieve efficient representation of the data, but not to remove information or distort the initial data.

Lossy data compression techniques allow a pre-specified distortion or fidelity between the original data and their reconstructions at the decoder. They may be used in signal processing applications that can tolerate information loss, such as transmission and storage of digital audio and video data, etc., and in the design of communication systems in which channels support limited rates. Lossy data compression system requirements for communication and storage applications may be more relaxed than those of lossless data compression.

Network lossy and lossless compression techniques, also sometimes called network source coding, are compression techniques for multiple sources that may generate multiple data vectors and/or sequences, and may be implemented with multiple encoders and/or decoders. For a given set of sources that generate data and a given communication network, an efficient technique to compress the data for reliable transmission over the network may take advantage of the correlations that exist in the data generated by multiple sources. One example technique may remove redundancy that exists in the data, and introduce collaboration between the encoders and decoders to improve performance, with respect to limited capacity requirements, compared to those techniques that treat data sequences and/or vectors independently and that do not introduce collaboration between encoders and decoders.

A code for network source coding may be a set of functions that specify how data generated by sources are compressed, transmitted through the channels, and reconstructed at the receivers. In lossy network source coding a code is called reliable if the data generated by the sources can be reconstructed, subject to satisfying an average distortion or fidelity at the intended receivers, asymptotically (in lossless compression the average error converges asymptotically to zero). The performance and efficiency of the code is quantified by the rates at which the network sends data over the channels to the intended receivers. Substantially optimal codes achieve the smallest possible rates and hence require the least channel capacity when transported to the receivers. A collection of channel rates, one for each channel, is called achievable if there exists a reliable code which ensures operation at these rates. The collection of all achievable channel rates, denoted by ℛ, is called an achievable rate region of the network. The lower boundary of ℛ, denoted by ℛ̲, may give the fundamental limit to compare one code with another code, i.e., the performance of reliable codes. The achievable rate region ℛ is only known for a small collection of networks. It is a standard practice to characterize the minimum lossless and lossy compression rates of data using Shannon's information theoretic measures of entropy, conditional entropy, mutual information, and the rate distortion function and conditional rate distortion function subject to a fidelity criterion.

The derivations and discussion herein generally relate to algorithms and methods for the lossy source coding network, known as the Gray-Wyner network, which has applications to network communication, computer storage, and cyber security. Present principles develop algorithms and methods to compute the (achievable) rate region of the Gray-Wyner network, denoted by ℛ_(GW)(Δ₁, Δ₂) and referred to as the “Gray-Wyner lossy (achievable) rate region,” and its “lower boundary,” denoted by ℛ̲_(GW)(Δ₁, Δ₂) and referred to as the “Gray-Wyner lossy lower boundary of the rate region,” for sources that may generate two vectors and/or two sequences of multivariate data. The algorithms may be substantially optimal for data that may be modeled by jointly Gaussian discrete-time processes. The data may comprise a sequence of pairs or tuples of data vectors:

(Y₁^(N), Y₂^(N))=(Y_(1,1), . . . , Y_(1,N), Y_(2,1), . . . , Y_(2,N))  (1)

where

Y_(1,j) is a p₁-dimensional vector, j=1, . . . , N,  (2)

Y_(2,j) is a p₂-dimensional vector, j=1, . . . , N,  (3)

(Y_(1,j), Y_(2,j)) are independent of (Y_(1,i), Y_(2,i)) for j≠i,  (4)

(Y_(1,j), Y_(2,j)) are correlated with Gaussian distribution P_(Y₁,Y₂)(y₁, y₂), j=1, . . . , N  (5)

subject to square error distortion functions given by

$\begin{matrix}{{{D_{Y_{1}}\left( {y_{1}^{N},{\hat{y}}_{1}^{N}} \right)}\overset{\Delta}{=}{\frac{1}{N}{\sum\limits_{i = 1}^{N}{\left| {y_{1,i} - {\hat{y}}_{1,i}} \right|^{2}}}}},\quad{{{D_{Y_{2}}\left( {y_{2}^{N},{\hat{y}}_{2}^{N}} \right)}\overset{\Delta}{=}{\frac{1}{N}{\sum\limits_{i = 1}^{N}{\left| {y_{2,i} - {\hat{y}}_{2,i}} \right|^{2}.}}}}} & (6)\end{matrix}$
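The distortion functions of Eq. (6) may be evaluated directly; the sketch below assumes each vector y_(1,i) is stored as a row of a NumPy array.

```python
import numpy as np

def avg_square_error(y, y_hat):
    """Eq. (6): (1/N) * sum over i of |y_i - y_hat_i|^2, rows indexed by i."""
    y, y_hat = np.asarray(y), np.asarray(y_hat)
    return float(np.mean(np.sum((y - y_hat) ** 2, axis=-1)))
```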

Present principles develop methods and algorithms to compute Wyner's lossy common information, denoted by C_(W)(Y₁, Y₂; Δ₁, Δ₂), for sources that generate data that may be modeled as two sequences of multivariate jointly Gaussian discrete-time processes.

Embodiments of this invention present methods and algorithms to compute the Gray-Wyner lossy rate region ℛ_(GW)(Δ₁, Δ₂), the Gray-Wyner lossy lower boundary of the rate region ℛ̲_(GW)(Δ₁, Δ₂), and Wyner's lossy common information C_(W)(Y₁, Y₂; Δ₁, Δ₂), using rate distortion theory, and more specifically, the rate distortion function of a vector of data modeled by independent Gaussian random variables, subject to mean-square error (MSE) distortion functions, and which may be characterized by water-filling solutions.

In an embodiment, methods described herein may use Algorithm 1, given below in detail, to parametrize all triples of data vectors that may be modeled by Gaussian random variables of which the third makes the remaining two variables conditionally independent and of which the joint probability distribution of the first two variables agrees with a pre-specified joint probability distribution. Algorithm 1 may be a preliminary step to the computation of ℛ_(GW)(Δ₁, Δ₂), ℛ̲_(GW)(Δ₁, Δ₂), and C_(W)(Y₁, Y₂; Δ₁, Δ₂), and to the generation of messages with rates that belong to ℛ_(GW)(Δ₁, Δ₂), ℛ̲_(GW)(Δ₁, Δ₂), and C_(W)(Y₁, Y₂; Δ₁, Δ₂).

Embodiments of the methods described herein may also use Algorithms 2 and/or 3, given below in detail, to implement a pre-encoder that applies a basis transformation to a tuple of sequences of vectors defined by Eqs. (1)-(5) to transform them into the canonical variable form of the tuple, that is expressed in terms of canonical singular values of the tuple (pair of vectors). The basis transformation may be a nonsingular transformation that may be equivalent to a pre-encoder that maps sequences of data (Y₁^(N), Y₂^(N)) into their canonical variable form. The elements and principles of the algorithm that generates the canonical variable form of the tuple of vectors that may be modeled by Gaussian random variables and/or Gaussian random processes are given below. The canonical variable form may be used in the computation of the Gray-Wyner lossy rate region ℛ_(GW)(Δ₁, Δ₂), the Gray-Wyner lossy lower boundary of the rate region ℛ̲_(GW)(Δ₁, Δ₂), Wyner's lossy common information C_(W)(Y₁, Y₂; Δ₁, Δ₂), and the generation of messages with rates that belong to ℛ_(GW)(Δ₁, Δ₂), ℛ̲_(GW)(Δ₁, Δ₂), and C_(W)(Y₁, Y₂; Δ₁, Δ₂).

Embodiments of methods presented herein may also use Algorithm 3, given below in detail, to employ the canonical variable form while computing the Gray-Wyner lossy rate region ℛ_(GW)(Δ₁, Δ₂), the Gray-Wyner lossy lower boundary of the rate region ℛ̲_(GW)(Δ₁, Δ₂), and Wyner's lossy common information C_(W)(Y₁, Y₂; Δ₁, Δ₂), of Eqs. (1)-(6), that is expressed in terms of the canonical singular values.

Algorithm 3 may include the computation of the rate triple (R₀, R₁, R₂) that lies in the Pangloss plane of the Gray-Wyner lossy rate region ℛ_(GW)(Δ₁, Δ₂), using two distortion functions of independent Gaussian random variables with zero mean and variance represented by the canonical singular values or canonical correlation coefficients, subject to a mean square error (MSE) distortion criterion, and Wyner's lossy common information. In another aspect, Algorithm 3 may represent a sequence of vectors modeled by jointly independent multivariate Gaussian random variables (Y₁^(N), Y₂^(N)) by a sequence of multivariate triples modeled by independent Gaussian random variables (Z₁^(N), Z₂^(N), X^(N)), such that Y₁^(N) is expressed in terms of (Z₁^(N), X^(N)), and Y₂^(N) is expressed in terms of (Z₂^(N), X^(N)).

In another aspect, Algorithm 3 represents the reproductions (Ŷ₁^(N), Ŷ₂^(N)) of (Y₁^(N), Y₂^(N)), that satisfy the MSE fidelity criterion, by a triple of independent random variables (Ẑ₁^(N), Ẑ₂^(N), X^(N)), such that Ŷ₁^(N) is expressed in terms of (Ẑ₁^(N), X^(N)), and Ŷ₂^(N) is expressed in terms of (Ẑ₂^(N), X^(N)).

In another aspect, Algorithm 3 presents a method based on the above representations of (Y₁^(N), Y₂^(N)) and (Ŷ₁^(N), Ŷ₂^(N)) from which all rates that lie in the Gray-Wyner lossy rate region, (R₀, R₁, R₂)∈ℛ_(GW)(Δ₁, Δ₂), and in the Gray-Wyner lossy lower boundary of the rate region, ℛ̲_(GW)(Δ₁, Δ₂), are computed by invoking rate distortion functions that correspond to the rate distortion function of a vector of independent Gaussian random variables, subject to MSE distortion, and variants of it.

The embodiments of the invention may also use Algorithm 4, given below in detail, to represent Ŷ₁^(N) at the output of a first pre-decoder, and to represent Ŷ₂^(N) at the output of a second pre-decoder. Each representation in Algorithm 4 may include two independent parts: the signal D^(1/2)X^(N) that is common to both, and a private part that is realized by parallel additive Gaussian noise channels. It is then clear to those familiar with the art of implementing compression or quantization techniques, such as lattice codes, joint entropy coded dithered quantization (ECDQ), and existing codes and quantization methods, that the pre-decoder reproduction sequences (Ŷ₁^(N), Ŷ₂^(N))={(Ŷ_(1,i), Ŷ_(2,i))}_(i=1)^(N) are such that (Ŷ_(1,i), Ŷ_(2,i)) are jointly independent and identically distributed, and hence existing quantization techniques, such as the ones described above, are readily applicable to produce the messages (W₀, W₁, W₂).

The present disclosure also provides analogous algorithms to Algorithms 1-4 for time-invariant Gaussian processes, in which the two output processes (Y₁^(N), Y₂^(N)) are generated by the representation

$\begin{matrix}{{{X\left( {t + 1} \right)} = {{{AX}(t)} + {{MV}(t)}}},{{X\left( t_{0} \right)} = X_{0}},} & (7) \\{{{Y(t)} = {\begin{pmatrix}{Y_{1}(t)} \\{Y_{2}(t)}\end{pmatrix} = {{\begin{pmatrix}C_{1} \\C_{2}\end{pmatrix}{X(t)}} + {\begin{pmatrix}N_{1} \\N_{2}\end{pmatrix}{V(t)}}}}},{C = \begin{pmatrix}C_{1} \\C_{2}\end{pmatrix}},{N = \begin{pmatrix}N_{1} \\N_{2}\end{pmatrix}},n,m_{v},p_{1},{p_{2}\mspace{14mu} {are}\mspace{14mu} {positive}\mspace{14mu} {integers}},{{X_{0}\text{:}\mspace{14mu} \Omega}->{\mathbb{R}}^{n}},{X_{0}\mspace{14mu} {is}\mspace{14mu} {zero}\mspace{14mu} {mean}\mspace{14mu} {{Gaussian}.i.e.}},{X_{0} \in {G\left( {0,G_{X_{0}}} \right)}},{{V\text{:}\mspace{14mu} \Omega \times T}->{\mathbb{R}}^{m_{v}}},{{V(t)} \in {G\left( {0,Q_{V}} \right)}},{Q_{V} = {Q_{V}^{T} \geq 0}},{T = \left\{ {0,1,2,\ldots}\mspace{14mu} \right\}},{A \in {\mathbb{R}}^{n \times n}},{M \in {\mathbb{R}}^{n \times m_{v}}},{C_{1} \in {\mathbb{R}}^{p_{1} \times n}},{C_{2} \in {\mathbb{R}}^{p_{2} \times n}},{N_{1} \in {\mathbb{R}}^{p_{1} \times m_{v}}},{N_{2} \in {\mathbb{R}}^{p_{2} \times m_{v}}},{{X\text{:}\mspace{14mu} \Omega \times T}->{\mathbb{R}}^{n}},{{Y_{1}\text{:}\mspace{14mu} \Omega \times T}->{\mathbb{R}}^{p_{1}}},{{Y_{2}\text{:}\mspace{14mu} \Omega \times T}->{{\mathbb{R}}^{p_{2}}.}}} & (8)\end{matrix}$

The white noise process V may represent a sequence of independent random variables. It is assumed that the white noise process V and the initial state X₀ are independent objects. The assumptions of the algorithm below then imply the existence of an invariant distribution of the output process Y, which is Gaussian with distribution Y(t)∈G(0, Q_(Y)) for all t∈T.
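A short simulation of the representation in Eqs. (7)-(8) is sketched below; the dimensions, the stable choice of A, and the identity noise covariance Q_V = I are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative dimensions (assumptions, not from the disclosure):
n, m_v, p1, p2, T = 4, 4, 3, 2, 1000
A = 0.9 * np.eye(n)                       # stable state matrix
M = np.eye(n, m_v)
C1 = rng.standard_normal((p1, n)); N1 = rng.standard_normal((p1, m_v))
C2 = rng.standard_normal((p2, n)); N2 = rng.standard_normal((p2, m_v))

x = np.zeros(n)                           # X(t0) = X0 = 0 for simplicity
Y1, Y2 = [], []
for t in range(T):
    v = rng.standard_normal(m_v)          # V(t) ~ G(0, I)
    Y1.append(C1 @ x + N1 @ v)            # Eq. (8), first output
    Y2.append(C2 @ x + N2 @ v)            # Eq. (8), second output
    x = A @ x + M @ v                     # Eq. (7), state recursion
Y1, Y2 = np.array(Y1), np.array(Y2)       # correlated output sequences
```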

The analogous algorithms to Algorithms 1-4 for representations (7) and (8) may be similar for the following two cases.

Case (1): Time-Invariant Gaussian Processes.

The assumptions included ensure that for the above Gaussian system there exist the Kalman filter system and its innovation process, defined by the realization:

X̂(t+1)=AX̂(t)+KV̄(t),  (9)

Y(t)=CX̂(t)+V̄(t), V̄(t)∈G(0, Q_(V̄)),  (10)

where V̄ denotes the innovation process. The innovation process is a sequence of independent random variables, each of which has a Gaussian probability distribution with V̄(t)∈G(0, Q_(V̄)). From this point onwards, analogous algorithms to Algorithms 1-4 of the previous section can be obtained for the stationary innovations process V̄(t)∈G(0, Q_(V̄)).

Case (2): Time-Invariant Stationary Gaussian Processes.

The assumptions included ensure that for the above Gaussian system there exists an invariant probability distribution of the state process, which is denoted by X(t)∈G(0, Q_(X)), where the matrix Q_(X)=Q_(X)^(T)>0 is the unique solution of the discrete Lyapunov equation

Q_(X)=AQ_(X)A^(T)+MQ_(V)M^(T),  (11)

and an invariant probability distribution of the output process Y(t)∈G(0, Q_(Y)) with

Q_(Y)=CQ_(X)C^(T)+NQ_(V)N^(T).  (12)
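Eq. (11) may be solved by iterating its fixed point, after which Eq. (12) gives the output covariance; the iteration count below is an illustrative choice, and the sketch assumes the spectral radius of A is below one.

```python
import numpy as np

def solve_lyapunov(A, M, Q_V, iters=1000):
    """Iterate Q <- A Q A^T + M Q_V M^T toward the fixed point of Eq. (11);
    converges when the spectral radius of A is below one."""
    Q = np.zeros_like(A)
    for _ in range(iters):
        Q = A @ Q @ A.T + M @ Q_V @ M.T
    return Q

def output_covariance(C, N, Q_X, Q_V):
    """Eq. (12): Q_Y = C Q_X C^T + N Q_V N^T."""
    return C @ Q_X @ C.T + N @ Q_V @ N.T
```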

The tuple (Y₁, Y₂) of stationary jointly-Gaussian processes may then have an invariant probability distribution Y(t)∈G(0, Q_(Y)). From this point onwards, analogous algorithms to Algorithms 1-4 of the previous section can be obtained for the stationary jointly-Gaussian processes (Y₁, Y₂). One data source or two data sources may generate a pair of data sequences (Y₁^(N), Y₂^(N))={(Y_(1,i), Y_(2,i))}_(i=1)^(N), with jointly independent components (Y_(1,i), Y_(2,i)) distributed according to the joint probability distribution P_(Y_(1,i),Y_(2,i))(y_(1,i), y_(2,i)), where the components take values (Y_(1,i), Y_(2,i))=(y_(1,i), y_(2,i))∈𝕐₁×𝕐₂. The alphabet spaces 𝕐₁ and 𝕐₂ can be finite sets, finite-dimensional Euclidean spaces, or arbitrary spaces.

Lossy compression coding techniques may be consistent with the communication network 100 of FIG. 1, under the following requirements.

-   1. An encoder takes as its input the data sequences (Y₁^(N), Y₂^(N)) and produces at its output three messages (W₀, W₁, W₂). There are three channels, Channel 0, Channel 1, and Channel 2, with capacities (C₀, C₁, C₂), respectively, to transmit the messages to two decoders: Channel 0 is a public channel, and Channel 1 and Channel 2 are the private channels which connect the encoder to each of the two decoders. Of the three messages, message W₀ is a “common” or “public” message that is transmitted through the public Channel 0 with capacity C₀ to decoder 1 and decoder 2, W₁ is a “private” message, which is transmitted through the “private” Channel 1 with capacity C₁ to decoder 1, and W₂ is a “private” message, which is transmitted through the “private” Channel 2 with capacity C₂ to decoder 2.
-   2. Decoder 1 reproduces Y₁^(N) by Ŷ₁^(N) subject to an average distortion, and decoder 2 reproduces Y₂^(N) by Ŷ₂^(N) subject to an average distortion, where (Ŷ_(1,i), Ŷ_(2,i))=(ŷ_(1,i), ŷ_(2,i))∈𝕐̂₁×𝕐̂₂, i=1, . . . , N, and the alphabet spaces 𝕐̂₁ and 𝕐̂₂ are arbitrary spaces.

The code of the Gray-Wyner source coding network is denoted by the parameters (N, M₀, M₁, M₂, Δ_(Y₁), Δ_(Y₂)), and consists of the following elements:

-   1. An encoder mapping f^((E)) which generates messages (W₀, W₁, W₂) that take values in message sets W_(j)∈𝒲_(j)≜{1, . . . , M_(j)}, j=0, 1, 2, by

f^((E))(Y₁^(N), Y₂^(N))=(W₀, W₁, W₂).  (13)

-   2. A pair of decoder mappings (f_(Y₁)^((D)), f_(Y₂)^((D))) which generates the reconstructions (Ŷ₁^(N), Ŷ₂^(N)) of (Y₁^(N), Y₂^(N)) by

Ŷ₁^(N)=f_(Y₁)^((D))(W₀, W₁),  (14)

Ŷ₂^(N)=f_(Y₂)^((D))(W₀, W₂).  (15)

-   3. An encoder-decoder average fidelity or distortion (Δ_(Y₁), Δ_(Y₂)), where

Δ_(Y₁)=E{D_(Y₁)(Y₁^(N), Ŷ₁^(N))}, Δ_(Y₂)=E{D_(Y₂)(Y₂^(N), Ŷ₂^(N))},  (16)

-   and the non-negative distortion functions D_(Y₁)(.,.), D_(Y₂)(.,.) are single-letter, given by

$\begin{matrix}{{D_{Y_{1}}\left( {y_{1}^{N},{\hat{y}}_{1}^{N}} \right)} = {\frac{1}{N}{\sum\limits_{i = 1}^{N}{d_{Y_{1}}\left( {y_{1,i},{\hat{y}}_{1,i}} \right)}}},} & (17) \\{{D_{Y_{2}}\left( {y_{2}^{N},{\hat{y}}_{2}^{N}} \right)} = {\frac{1}{N}{\sum\limits_{i = 1}^{N}{{d_{Y_{2}}\left( {y_{2,i},{\hat{y}}_{2,i}} \right)}.}}}} & (18)\end{matrix}$

-   There are two operational definitions of the code (N, M₀, M₁, M₂, Δ_(Y₁), Δ_(Y₂)), the lossy and the lossless. These are defined below.
-   (a) The Gray-Wyner Lossy Rate Region. For any Δ₁≥0, Δ₂≥0, a rate triple (R₀, R₁, R₂) is said to be (Δ₁, Δ₂)-achievable if, for arbitrary ε>0 and N sufficiently large, there exists a code (N, M₀, M₁, M₂, Δ_(Y₁), Δ_(Y₂)) such that

M_(i)≤2^(N(R_(i)+ε)), i=0, 1, 2,  (19)

Δ_(Y₁)≤Δ₁+ε, Δ_(Y₂)≤Δ₂+ε.  (20)

-   The set of all (Δ₁, Δ₂)-achievable rate triples (R₀, R₁, R₂) is called the achievable Gray-Wyner lossy rate region and is denoted by ℛ_(GW)(Δ₁, Δ₂).
-   (b) The Gray-Wyner Lossless Rate Region. Suppose 𝕐̂₁=𝕐₁ and 𝕐̂₂=𝕐₂ are sets of finite cardinality, D_(Y₁)=D_(Y₂) is the Hamming distance, and Δ₁=Δ₂=0. Then ℛ_(GW)(Δ₁, Δ₂) in (a) reduces to the achievable region of the Gray-Wyner lossless rate region, denoted by ℛ_(GW). The convex closure of the rate triples may be completely defined by the lower boundary of the rate region:

ℛ̲_(GW)(Δ₁, Δ₂)={(R₀, R₁, R₂)∈ℛ_(GW)(Δ₁, Δ₂): (R̂₀, R̂₁, R̂₂)∈ℛ_(GW)(Δ₁, Δ₂), R̂_(i)≤R_(i) (i=0, 1, 2)→R̂_(i)=R_(i) (i=0, 1, 2)}.  (21)

Given a pair of jointly independent sequences (Y₁^(N), Y₂^(N)), generated from the joint probability distribution (Y_(1,i), Y_(2,i))˜P_(Y₁,Y₂)(y₁, y₂)≡P_(Y₁,Y₂), i=1, . . . , N, and a network with capacity triple (C₀, C₁, C₂), the sequences Y₁^(N) and Y₂^(N) can be reliably reconstructed by Ŷ₁^(N) and Ŷ₂^(N), respectively, subject to an average distortion of less than or equal to Δ₁ and Δ₂, respectively, if and only if the capacity triple (C₀, C₁, C₂) lies above ℛ̲_(GW)(Δ₁, Δ₂). Hence, ℛ̲_(GW)(Δ₁, Δ₂) defines those capacity triples (C₀, C₁, C₂) which are necessary and sufficient for reliable communication over Channel 0, Channel 1, and Channel 2.

To achieve any rate triple (R₀, R₁, R₂) that lies on the Gray-Wyner lower boundary of the rate region ℛ̲_(GW)(Δ₁, Δ₂), the capacity of Channel 0 (C₀) should be prioritized and used as the information “common” to both Y₁^(N) and Y₂^(N). Accordingly, one may design a coding scheme based on the identification of a sequence of random variables W^(N), called “auxiliary” random variables, that represent the information transmitted over the common Channel 0; any rate triple (R₀, R₁, R₂)∈ℛ_(GW)(Δ₁, Δ₂) can be achieved by optimizing over the choice of random variables W^(N), i.e., the choice of joint distributions P_(Y₁^(N),Y₂^(N),W^(N))(y₁^(N), y₂^(N), w^(N)) having marginal P_(Y₁^(N),Y₂^(N))(y₁^(N), y₂^(N)).

The characterization of the (achievable) Gray-Wyner rate region ℛ_(GW)(Δ₁, Δ₂) and the Gray-Wyner (achievable) lower boundary of the rate region ℛ̲_(GW)(Δ₁, Δ₂) may be determined by invoking Shannon's information theoretic measures, and specifically, rate distortion theory.

A pair of jointly independent sequences (Y₁^(N), Y₂^(N)), generated from the joint probability distribution (Y_(1,i), Y_(2,i))˜P_(Y₁,Y₂)(y₁, y₂)≡P_(Y₁,Y₂), i=1, . . . , N, may be augmented to a triple: joint sequences (Y₁^(N), Y₂^(N), W^(N)) generated from the joint probability distribution (Y_(1,i), Y_(2,i), W_(i))˜P_(Y₁,Y₂,W)(y₁, y₂, w)≡P_(Y₁,Y₂,W), i=1, . . . , N, that have marginal (Y_(1,i), Y_(2,i))˜P_(Y₁,Y₂,W)(y₁, y₂, ∞)≡P_(Y₁,Y₂), i=1, . . . , N, and are jointly independent random variables.

The characterization of the Gray-Wyner rate region ℛ_(GW)(Δ₁, Δ₂) and the Gray-Wyner lower boundary of the rate region ℛ̲_(GW)(Δ₁, Δ₂) in terms of rate distortion theory is described below.

The joint rate distortion function may be defined by:

$\begin{matrix}{{R_{Y_{1},Y_{2}}\left( {\Delta_{1},\Delta_{2}} \right)} = {\inf\limits_{P_{{\hat{Y}}_{1},{\hat{Y}}_{2}|Y_{1},Y_{2}}\text{:}\;{E{\{{d_{Y_{1}}{({Y_{1},{\hat{Y}}_{1}})}}\}} \leq \Delta_{1}},\;{E{\{{d_{Y_{2}}{({Y_{2},{\hat{Y}}_{2}})}}\}} \leq \Delta_{2}}}{I\left( {Y_{1},Y_{2};{\hat{Y}}_{1},{\hat{Y}}_{2}} \right)}}} & (22)\end{matrix}$

where I(Y₁, Y₂; Ŷ₁, Ŷ₂) is the mutual information between (Y₁, Y₂) and (Ŷ₁, Ŷ₂), and the expectation is taken with respect to the joint distribution P_(Y₁,Y₂,Ŷ₁,Ŷ₂)=P_(Ŷ₁,Ŷ₂|Y₁,Y₂)P_(Y₁,Y₂). The interpretation is that R_(Y₁,Y₂)(Δ₁, Δ₂) is the minimum rate of reconstructing (Y₁^(N), Y₂^(N)) by (Ŷ₁^(N), Ŷ₂^(N)) subject to an average distortion of less than or equal to (Δ₁, Δ₂) for sufficiently large N.

Similarly, the marginal rate distortion functions may be defined by:

$\begin{matrix}{{{R_{Y_{1}}\left( \Delta_{1} \right)} = {\inf\limits_{{P_{{\hat{Y}}_{1}|Y_{1}}\text{:}E{\{{d_{Y_{1}}{({Y_{1},{\hat{Y}}_{1}})}}\}}} \leq \Delta_{1}}{I\left( {Y_{1};{\hat{Y}}_{1}} \right)}}},} & (23) \\{{R_{Y_{2}}\left( \Delta_{2} \right)} = {\inf\limits_{{P_{{\hat{Y}}_{2}|Y_{2}}\text{:}E{\{{d_{Y_{2}}{({Y_{2},{\hat{Y}}_{2}})}}\}}} \leq \Delta_{2}}{{I\left( {Y_{2};{\hat{Y}}_{2}} \right)}.}}} & (24)\end{matrix}$

Another rate distortion function which may play a role in the formal description of ℛ̲_(GW)(Δ₁, Δ₂) is the conditional rate distortion function. Assuming P_(Y₁,Y₂,W)(y₁, y₂, w) is a joint probability distribution on 𝕐₁×𝕐₂×𝕎, where 𝕎 is arbitrary, such that the 𝕐₁×𝕐₂ marginal probability distribution satisfies P_(Y₁,Y₂,W)(y₁, y₂, ∞)=P_(Y₁,Y₂)(y₁, y₂), the conditional rate distortion function of random variables (Y₁, Y₂) conditioned on the random variable W is defined by:

$\begin{matrix}{{R_{Y_{1},{Y_{2}|W}}\left( {\Delta_{1},\Delta_{2}} \right)} = {\inf\limits_{P_{{\hat{Y}}_{1},{\hat{Y}}_{2}|Y_{1},Y_{2},W}\text{:}\;{E{\{{d_{Y_{1}}{({Y_{1},{\hat{Y}}_{1}})}}\}} \leq \Delta_{1}},\;{E{\{{d_{Y_{2}}{({Y_{2},{\hat{Y}}_{2}})}}\}} \leq \Delta_{2}}}{I\left( {Y_{1},{Y_{2};{\hat{Y}}_{1}},\left. {\hat{Y}}_{2} \middle| W \right.} \right)}}} & (25)\end{matrix}$

where I(Y₁, Y₂; Ŷ₁, Ŷ₂|W) is the conditional mutual information between (Y₁, Y₂) and (Ŷ₁, Ŷ₂) conditioned on W. Consequently, R_(Y₁,Y₂|W)(Δ₁, Δ₂) is the minimum rate of reconstructing (Y₁^(N), Y₂^(N)) by (Ŷ₁^(N), Ŷ₂^(N)) subject to an average distortion of less than or equal to (Δ₁, Δ₂) for sufficiently large N, when both encoders and decoders know W^(N).

Similarly, the conditional marginal rate distortion functions may be defined by:

$\begin{matrix}{{{R_{Y_{1}|W}\left( \Delta_{1} \right)} = {\inf\limits_{{P_{{{\hat{Y}}_{1}|Y_{1}},W}\text{:}E{\{{d_{Y_{1}}{({Y_{1},{\hat{Y}}_{1}})}}\}}} \leq \Delta_{1}}{I\left( {Y_{1};\left. {\hat{Y}}_{1} \middle| W \right.} \right)}}},} & (26) \\{{R_{Y_{2}|W}\left( \Delta_{2} \right)} = {\inf\limits_{{P_{{{\hat{Y}}_{2}|Y_{2}},W}\text{:}E{\{{d_{Y_{2}}{({Y_{2},{\hat{Y}}_{2}})}}\}}} \leq \Delta_{2}}{{I\left( {Y_{2};\left. {\hat{Y}}_{2} \middle| W \right.} \right)}.}}} & (27)\end{matrix}$

The following lower bound on the Gray-Wyner lossy rate region ℛ_(GW)(Δ₁, Δ₂) may be defined for the processes defined by Eqs. (1)-(6):

If (R₀, R₁, R₂)∈ℛ_(GW)(Δ₁, Δ₂), then

R₀+R₁+R₂≥R_(Y₁,Y₂)(Δ₁, Δ₂),  (28)

R₀+R₁≥R_(Y₁)(Δ₁),  (29)

R₀+R₂≥R_(Y₂)(Δ₂).  (30)
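A rate triple may be screened against the bounds of Eqs. (28)-(30) as follows; the sketch assumes the three rate distortion values have already been evaluated.

```python
def satisfies_lower_bounds(R0, R1, R2, R_joint, R_Y1, R_Y2):
    """Check Eqs. (28)-(30): R_joint = R_{Y1,Y2}(D1,D2), and
    R_Y1, R_Y2 are the marginal rate distortion function values."""
    return (R0 + R1 + R2 >= R_joint and
            R0 + R1 >= R_Y1 and
            R0 + R2 >= R_Y2)
```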

The inequality in Eq. 28 may be called the “Pangloss Bound” of the Gray-Wyner lossy rate region ℛ_(GW)(Δ₁, Δ₂). The set of triples (R₀, R₁, R₂)∈ℛ_(GW)(Δ₁, Δ₂) that satisfy the equality R₀+R₁+R₂=R_(Y₁,Y₂)(Δ₁, Δ₂) for some joint distribution P_(Y₁,Y₂,W)(y₁, y₂, w) may be called the “Pangloss Plane” of the Gray-Wyner lossy rate region ℛ_(GW)(Δ₁, Δ₂), and corresponds to the case when the auxiliary RV W or common information is prioritized.

A coding scheme which uses the auxiliary RV W to represent the information transported over Channel 0 may achieve any rate triple (R₀, R₁, R₂)∈ℛ_(GW)(Δ₁, Δ₂) by optimizing over the choice of W. Processes defined by Eqs. (1)-(6) define the family of probability distributions:

𝒫≜{P_(Y₁,Y₂,W)(y₁, y₂, w), y₁∈𝕐₁, y₂∈𝕐₂, w∈𝕎: P_(Y₁,Y₂,W)(y₁, y₂, ∞)=P_(Y₁,Y₂)(y₁, y₂)}  (31)

for some RV W, i.e., such that the joint probability distribution P_(Y₁,Y₂,W)(y₁, y₂, w) on 𝕐₁×𝕐₂×𝕎 has (Y₁, Y₂)-marginal probability distribution P_(Y₁,Y₂)(y₁, y₂) on 𝕐₁×𝕐₂ that coincides with the probability distribution of (Y₁, Y₂).

Each P_(Y₁,Y₂,W) in the set of Eq. (31) defines three RVs (Y₁, Y₂, W) as described above.

Suppose there exist ŷ₁∈𝕐̂₁ and ŷ₂∈𝕐̂₂ such that E{d_(Y_(i))(Y_(i), ŷ_(i))}<∞, i=1, 2. For each P_(Y₁,Y₂,W)∈𝒫 and Δ₁≥0, Δ₂≥0, define the subset of Euclidean 3-dimensional space

ℛ_(GW)^(P_(Y₁,Y₂,W))(Δ₁, Δ₂)={(R₀, R₁, R₂): R₀≥I(Y₁, Y₂; W), R₁≥R_(Y₁|W)(Δ₁), R₂≥R_(Y₂|W)(Δ₂)}.  (32)

Let

ℛ_(GW)*(Δ₁, Δ₂)=(∪_(P_(Y₁,Y₂,W)∈𝒫)ℛ_(GW)^(P_(Y₁,Y₂,W))(Δ₁, Δ₂))^(c)  (33)

-   where (.)^(c) denotes the closure of the set. Then the achievable Gray-Wyner lossy rate region is given by

ℛ_(GW)(Δ₁, Δ₂)=ℛ_(GW)*(Δ₁, Δ₂).  (34)

It follows that ℛ_(GW)(Δ₁, Δ₂) is completely described by a single coding scheme that uses the RV W. Moreover, the following statements hold:

(1) Since ℛ_(GW)*(Δ₁, Δ₂) is convex, ℛ_(GW)(Δ₁, Δ₂) is also convex.

(2) The family of distributions in the set of Eq. (31) corresponds to the family of “test distributions” 𝒫_(T) defined by:

𝒫_(T)≜{P_(W|Y₁,Y₂)(w|y₁, y₂), w∈𝕎, y₁∈𝕐₁, y₂∈𝕐₂: P_(W,Y₁,Y₂)(y₁, y₂, w)=P_(W|Y₁,Y₂)(w|y₁, y₂)P_(Y₁,Y₂)(y₁, y₂), P_(Y₁,Y₂,W)(y₁, y₂, ∞)=P_(Y₁,Y₂)(y₁, y₂)}.  (35)

(3) The Gray-Wyner lossy rate region ℛ_(GW)(Δ₁, Δ₂) and its lower boundary ℛ̲_(GW)(Δ₁, Δ₂) may be determined from

$\begin{matrix}{{T\left( {\alpha_{1},\alpha_{2}} \right)} = {\inf\limits_{P_{Y_{1},Y_{2},W} \in {\mathcal{P}}_{T}}\left\{ {{I\left( {Y_{1},{Y_{2};W}} \right)} + {\alpha_{1}{R_{Y_{1}|W}\left( \Delta_{1} \right)}} + {\alpha_{2}{R_{Y_{2}|W}\left( \Delta_{2} \right)}}} \right\}}} & (36)\end{matrix}$

where (α₁, α₂) are arbitrary real numbers. If S(α₁, α₂) is the set of (R₀, R₁, R₂)∈ℛ_(GW)(Δ₁, Δ₂) that achieve T(α₁, α₂)≡T(α₁, α₂, R₀, R₁, R₂), then ℛ̲_(GW)(Δ₁, Δ₂) is a subset of ∪_((α₁,α₂): 0≤α₁,α₂≤1, α₁+α₂≥1)S(α₁, α₂).

An operational definition of the common information between two correlated random variables taking values in finite alphabet spaces may be used. Suppose the data source generates a pair of sequences (Y₁^(N), Y₂^(N)), with values (Y_(1,i), Y_(2,i))=(y_(1,i), y_(2,i))∈𝕐₁×𝕐₂, i=1, . . . , N, where the alphabet spaces 𝕐₁ and 𝕐₂ are finite sets, and the sequences are jointly independent and identically distributed. The interpretation of Wyner's common information between sequences Y₁^(N) and Y₂^(N) may also be described using the Gray-Wyner network and the operational definition given above. Wyner's common information between Y₁^(N) and Y₂^(N) is the smallest R₀ such that (R₀, R₁, R₂)∈ℛ_(GW) and lies on the Pangloss Plane, for some R₁, R₂.

The characterization of Wyner's common information may be expressed in terms of Shannon's information theoretic measures. More particularly, Wyner's common information between two sequences Y₁^(N) and Y₂^(N), which are jointly independent and identically distributed (Y_(1,i), Y_(2,i))˜P_(Y₁,Y₂)(y₁, y₂), i=1, . . . , N, may be given by:

$\begin{matrix}{{C_{W}\left( {Y_{1},Y_{2}} \right)} = {\inf\limits_{P_{Y_{1},Y_{2},W}\text{:}\; P_{Y_{1},{Y_{2}|W}} = {P_{Y_{1}|W}P_{Y_{2}|W}}}{I\left( {Y_{1},{Y_{2}\text{;}W}} \right)}}} & (37)\end{matrix}$

where P_(Y₁,Y₂,W)(y₁, y₂, w) is any joint probability distribution on 𝕐₁×𝕐₂×𝕎 with (Y₁, Y₂)-marginal P_(Y₁,Y₂)(y₁, y₂). Equivalently, C_(W)(Y₁, Y₂) is the infimum (or greatest lower bound) of the mutual information I(Y₁, Y₂; W) over all triples of random variables (Y₁, Y₂, W) satisfying:

(W1). Σ_(w∈𝕎)P_(Y₁,Y₂,W)(y₁, y₂, w)=P_(Y₁,Y₂)(y₁, y₂).  (38)

(W2). Random variables Y₁ and Y₂ are conditionally independent given random variable W.  (39)

Another operational definition of Wyner's lossy common information between two correlated random variables taking values in arbitrary spaces may be used. For example, Wyner's lossy common information between (Y₁^(N), Y₂^(N)) may be the smallest R₀ such that (R₀, R₁, R₂)∈ℛ_(GW)(Δ₁, Δ₂) and lies on the Pangloss Plane, for some R₁, R₂. The precise definition of this example is the following:

Consider a Gray-Wyner network. For any Δ₁≥0, Δ₂≥0, a number R₀ is said to be (Δ₁, Δ₂)-achievable if for arbitrary ε>0 and N sufficiently large, there exists a code (N, M₀, M₁, M₂, Δ_(Y₁), Δ_(Y₂)) such that

$\begin{matrix}{{M_{0} \leq 2^{{NR}_{0}}},} & (40) \\{{{\sum\limits_{i = 0}^{2}\; {\frac{1}{N}{\log M}_{i}}} \leq {{R_{Y_{1},Y_{2}}\left( {\Delta_{1},\Delta_{2}} \right)} + ɛ}},} & (41) \\{{\Delta_{Y_{1}} \leq {\Delta_{1} + ɛ}},\; {\Delta_{Y_{2}} \leq {\Delta_{2} + {ɛ.}}}} & (42)\end{matrix}$

The infimum of all R₀ that are (Δ₁, Δ₂)-achievable for some rates R₁, R₂, denoted by C_(W)(Y₁, Y₂; Δ₁, Δ₂), may be called Wyner's lossy common information between Y₁^(N) and Y₂^(N).

The sum rate bound in Eq. (41) may be called the Pangloss Bound of the lossy Gray-Wyner network. That is, C_(W)(Y₁, Y₂; Δ₁, Δ₂) is the minimum common message rate for the Gray-Wyner network with sum rate R_(Y₁,Y₂)(Δ₁, Δ₂) while satisfying the average distortion constraints.

A characterization of Wyner's lossy common information that is expressed in terms of Shannon's information theoretic measures may be provided. For example, Wyner's lossy common information between two sequences Y₁^(N) and Y₂^(N), which are jointly independent and identically distributed (Y_(1,i), Y_(2,i))˜P_(Y₁,Y₂)(y₁, y₂), i=1, . . . , N, is given by:

C_(W)(Y₁, Y₂; Δ₁, Δ₂)=inf{I(Y₁, Y₂; W): P_(Y₁,Y₂,Ŷ₁,Ŷ₂,W) satisfies (W3)-(W7)}  (43)

where:

(W3). P_(Y₁,Y₂,Ŷ₁,Ŷ₂,W)(y₁, y₂, ∞, ∞, ∞)=P_(Y₁,Y₂)(y₁, y₂),  (44)

(W4). random variables Ŷ₁ and Ŷ₂ are conditionally independent given random variable W,  (45)

(W5). random variables (Y₁, Y₂) and W are conditionally independent given random variables (Ŷ₁, Ŷ₂),  (46)

(W6). P_(Y₁,Y₂,Ŷ₁,Ŷ₂,W)(y₁, y₂, ŷ₁, ŷ₂, w)=P*_(Ŷ₁,Ŷ₂|Y₁,Y₂,W)(ŷ₁, ŷ₂|y₁, y₂, w)P_(Y₁,Y₂)(y₁, y₂),  (47)

(W7). P*_(Ŷ₁,Ŷ₂|Y₁,Y₂)(ŷ₁, ŷ₂|y₁, y₂) is the distribution corresponding to the rate distortion function R_(Y₁,Y₂)(Δ₁, Δ₂).  (48)

An alternative characterization of Wyner's lossy common information C_(W)(Y₁, Y₂; Δ₁, Δ₂) is the following:

If there exist ŷ₁∈𝕐̂₁ and ŷ₂∈𝕐̂₂ such that E{d_(Y_(i))(Y_(i), ŷ_(i))}<∞, i=1, 2, then Wyner's lossy common information C_(W)(Y₁, Y₂; Δ₁, Δ₂) is given by

inf I(Y₁, Y₂; W)  (49)

such that the identity holds

R_(Y₁|W)(Δ₁)+R_(Y₂|W)(Δ₂)+I(Y₁, Y₂; W)=R_(Y₁,Y₂)(Δ₁, Δ₂).  (50)

Additional or alternative characterizations of Wyner's common information or Wyner's lossy common information are possible.

Generally, the present disclosure may be applied to methods and algorithms to compute the Gray-Wyner lossy rate region ℛ_(GW)(Δ₁, Δ₂), its lower boundary ℛ̲_(GW)(Δ₁, Δ₂), and Wyner's lossy common information C_(W)(Y₁, Y₂; Δ₁, Δ₂), for two sequences {(Y_(1,i), Y_(2,i))}_(i=1)^(N) of jointly independent, Gaussian distributed multivariate vectors with square error fidelity, as defined by (1)-(6), and analogous expressions for the representations (7) and (8). The methods and algorithms may also apply to multivariate vectors with other distributions. For the processes defined by (1)-(6), the methods and algorithms include computations of ℛ_(GW)(Δ₁, Δ₂), ℛ̲_(GW)(Δ₁, Δ₂), and C_(W)(Y₁, Y₂; Δ₁, Δ₂) using the various information measures discussed above. For the processes defined by representations (7) and (8), the methods and algorithms are analogous to those of processes (1)-(6), but the information measures are replaced by those of stationary processes.

Algorithms 1-4 are now briefly discussed in relation to rate distortion theory. The disclosure describes a systematic network compression encoder and decoder system for data sources that generate two streams of correlated multivariate vectors that may be modeled as Gaussian sequences, that preserves the low complexity of standard non-network coding while operating near the Shannon performance limit as defined by the Gray-Wyner lossy rate region ℛ_(GW)(Δ₁, Δ₂), its lower boundary ℛ̲_(GW)(Δ₁, Δ₂), and Wyner's lossy common information C_(W)(Y₁, Y₂; Δ₁, Δ₂), described above, and their generalizations to stationary processes.

Embodiments of the discussed algorithms may be implemented in communication systems for transmitting correlated multivariate Gaussian sequences, in mass storage devices, and in other systems that will be apparent to those of ordinary skill in the art.

Accordingly, a network compression encoding assembly may be implemented by an electrical circuit and/or by a processor with access to a computer readable storage medium storing instructions that are executable by the processor to implement a network coding method with a systematic encoder that efficiently compresses the data prior to transmission.

Some aspects of Algorithm 3 are presented below to compute the rate triple (R₀, R₁, R₂) that lies in the Pangloss plane of the Gray-Wyner lossy rate region ℛ_(GW)(Δ₁, Δ₂), using the canonical variable form of processes (1)-(6), and the rate distortion functions of independent Gaussian random variables with zero mean and variance represented by the canonical singular values, subject, for example, to a mean square error (MSE) distortion criterion.

First, a basis transformation may be applied to a tuple of vector sequences defined by (1)-(5) to transform them into the canonical variable form of the tuple, that is expressed in terms of canonical singular values of the tuple. The basis transformation is a nonsingular transformation that is equivalent to a pre-encoder that maps sequences of data (Y₁^(N), Y₂^(N)) into their canonical variable form.

The rate triple (R₀, R₁, R₂) that lies in the Pangloss plane of the Gray-Wyner lossy rate region may be computed from two rate distortion functions of n independent Gaussian random variables with zero mean and variance (1−d_(i))∈(0,1), where d_(i) are the canonical singular values or canonical correlation coefficients, subject to MSE distortion functions, for i=1, . . . , n, for a specific pre-computed dimension n≤min{p₁, p₂}. These are given by:

$\begin{matrix}{{R_{1} \geq {R_{Y_{1}|W}\left( \Delta_{1} \right)}} = {{R_{Z_{1}}\left( \Delta_{1} \right)} = {\frac{1}{2}{\sum\limits_{j = 1}^{n}{\log\left( \frac{1 - d_{j}}{\Delta_{1,j}} \right)}}} = {\sum\limits_{j = 1}^{n}{R_{Z_{1,j}}\left( \Delta_{1,j} \right)}},} & (51) \\{{where}\quad{\Delta_{1,j} = \left\{ \begin{matrix}{\lambda,} & {\lambda \leq {1 - d_{j}}} \\{{1 - d_{j}},} & {\lambda \geq {1 - d_{j}}}\end{matrix} \right.},\quad{{\sum\limits_{j = 1}^{n}\Delta_{1,j}} = \Delta_{1}},\quad{\Delta_{1} \in \left( {0,\infty} \right)},} & (52) \\{{R_{2} \geq {R_{Y_{2}|W}\left( \Delta_{2} \right)}} = {{R_{Z_{2}}\left( \Delta_{2} \right)} = {\frac{1}{2}{\sum\limits_{j = 1}^{n}{\log\left( \frac{1 - d_{j}}{\Delta_{2,j}} \right)}}} = {\sum\limits_{j = 1}^{n}{R_{Z_{2,j}}\left( \Delta_{2,j} \right)}},} & (53) \\{{where}\quad{\Delta_{2,j} = \left\{ \begin{matrix}{\lambda,} & {\lambda \leq {1 - d_{j}}} \\{{1 - d_{j}},} & {\lambda \geq {1 - d_{j}}}\end{matrix} \right.},\quad{{\sum\limits_{j = 1}^{n}\Delta_{2,j}} = \Delta_{2}},\quad{\Delta_{2} \in {\left( {0,\infty} \right).}}} & (54)\end{matrix}$
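The water-filling allocation of Eqs. (51)-(52) (and identically Eqs. (53)-(54)) may be computed by bisecting on the water level λ, as sketched below; rates are in nats (natural logarithm), the coefficients are assumed to satisfy 0 < d_j < 1, and the fixed iteration count is an illustrative choice.

```python
import numpy as np

def water_fill_rate(d, delta_total):
    """Evaluate Eqs. (51)-(52): spread delta_total over n sources with
    variances (1 - d_j) by choosing the water level lambda via bisection,
    then sum the per-source rates 0.5 * log((1 - d_j) / Delta_j)."""
    var = 1.0 - np.asarray(d, dtype=float)   # variances (1 - d_j), assumes 0 < d_j < 1
    lo, hi = 0.0, float(var.max())
    for _ in range(100):                     # bisect on the water level lambda
        lam = 0.5 * (lo + hi)
        if np.minimum(lam, var).sum() < delta_total:
            lo = lam
        else:
            hi = lam
    Delta = np.minimum(lam, var)             # Delta_j = min(lambda, 1 - d_j)
    return float(0.5 * np.sum(np.log(var / Delta)))
```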

If R₀ is such that (R₀, R₁, R₂)∈ℛ_(GW)(Δ₁, Δ₂) and lies on the Pangloss Plane, then:

$\begin{matrix}{R_{0} = {{I\left( {Y_{1},{Y_{2}\text{;}W}} \right)} = {{{H^{cvf}\left( {Y_{1},Y_{2}} \right)} - {H\left( Z_{1} \right)} - {H\left( Z_{2} \right)}} = {\frac{1}{2}{\sum\limits_{i = 1}^{n}{\log \left( \frac{1 + d_{i}}{1 - d_{i}} \right)}}}}}} & (55)\end{matrix}$

where H^(cvf)(Y₁, Y₂) is the differential entropy of (Y₁, Y₂) expressed in canonical variable form, and H(Z_(i)) is the differential entropy of Z_(i), i=1, 2.
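Eq. (55) reduces the common rate to a closed form in the canonical correlation coefficients, as in the following sketch (again in nats):

```python
import numpy as np

def common_rate(d):
    """Eq. (55): R0 = 0.5 * sum over i of log((1 + d_i) / (1 - d_i))."""
    d = np.asarray(d, dtype=float)
    return float(0.5 * np.sum(np.log((1.0 + d) / (1.0 - d))))

# e.g., common_rate([0.9, 0.5]) together with water_fill_rate(...) above
# yields a candidate triple (R0, R1, R2) on the Pangloss plane.
```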

Moreover, (R₁, R₂, R₀) may satisfy:

R₀+R₁+R₂≥R_(Y₁,Y₂)(Δ₁, Δ₂)=I(Y₁, Y₂; W)+R_(Y₁|W)(Δ₁)+R_(Y₂|W)(Δ₂),  (56)

and (Δ₁, Δ₂) must lie in a region specified by the validity of Eq. 56.

The above techniques may be used to generate messages (W₀, W₁, W₂); to compute the achievable region for the lossy Gray-Wyner network, i.e., ℛ_(GW)(Δ₁, Δ₂); to compute the smallest R₀ such that (R₀, R₁, R₂)∈ℛ_(GW)(Δ₁, Δ₂) and lies on the Pangloss Plane, for some R₁, R₂; and to compute Wyner's lossy common information C_(W)(Y₁, Y₂; Δ₁, Δ₂).

Of the three messages (W₀, W₁, W₂), the encoder may transmit message W₀ over a noiseless channel of capacity R₀ to two decoders. Additionally, the encoder may transmit the private message W₁ over n independent parallel noiseless channels, each of rate R_(Z_(1,j))(Δ_(1,j)), to one decoder, and the encoder may transmit the private message W₂ over n independent parallel noiseless channels, each of rate R_(Z_(2,j))(Δ_(2,j)), to another decoder. The first decoder may reproduce Y₁^(N) by Ŷ₁^(N) subject to a MSE distortion and the second decoder may reproduce Y₂^(N) by Ŷ₂^(N) subject to a MSE distortion.

The encoder and decoder units may include a parallel hardware implementation to minimize computational complexity and to achieve the optimal rates that operate at the “Pangloss Plane.” The rate triples (R₀, R₁, R₂) may be computed using a so-called water-filling algorithm.

Some aspects of Algorithm 4 may involve a representation of Ŷ₁^(N) at the output of one pre-decoder, and a representation of Ŷ₂^(N) at the output of another pre-decoder, each of which consists of two independent parts: the signal D^(1/2)X that is common to both, and a private part that may be realized by parallel additive Gaussian noise channels, with the rates described above.

A variety of compression and quantization techniques, such as joint entropy coded dithered quantization (ECDQ), lattice codes, etc., may be applied to the pre-decoder reproduction sequences (Ŷ₁^(N), Ŷ₂^(N))={(Ŷ_(1,i), Ŷ_(2,i))}_(i=1)^(N), where (Ŷ_(1,i), Ŷ_(2,i)) are jointly independent and distributed according to Algorithm 4, to produce the messages (W₀, W₁, W₂). For example, ECDQ and lattice code techniques may be used to construct the compressed messages (W₀, W₁, W₂).
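As an illustration of one such technique, a minimal scalar subtractive-dither quantizer is sketched below; the lattice is the scaled integer grid step·ℤ, and the dither is assumed shared between encoder and decoder. This is a simplified stand-in for full ECDQ or lattice coding, not the disclosure's implementation.

```python
import numpy as np

rng = np.random.default_rng(2)

def dithered_encode(z, step):
    """Add a uniform dither, quantize to the lattice step*Z, and emit
    integer indices (which an entropy coder would then compress)."""
    dither = rng.uniform(-step / 2, step / 2, size=z.shape)
    idx = np.round((z + dither) / step).astype(int)
    return idx, dither

def dithered_decode(idx, dither, step):
    # Subtract the shared dither; the reconstruction error is uniform and
    # independent of the source, mimicking an additive-noise test channel.
    return idx * step - dither
```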

The following describes methods and algorithms that may be implemented to compute the Gray-Wyner lossy rate region ℛ_(GW)(Δ₁, Δ₂), its lower boundary ℛ̲_(GW)(Δ₁, Δ₂), and Wyner's lossy common information C_(W)(Y₁, Y₂; Δ₁, Δ₂), for two sequences {(Y_(1,i), Y_(2,i))}_(i=1)^(N) of jointly independent, Gaussian distributed multivariate vectors with square error fidelity, as defined by (1)-(6), and for two output processes (Y₁^(N), Y₂^(N)) that are generated by the representations as described above. The methods and algorithms include computations of ℛ_(GW)(Δ₁, Δ₂), ℛ̲_(GW)(Δ₁, Δ₂), and C_(W)(Y₁, Y₂; Δ₁, Δ₂) using the information measures described above.

In accordance with established convention in mathematics and probability, the following definitions may be used:

ℤ₊≜{1, 2, . . . }: the set of positive integers.

ℕ≜{0, 1, 2, . . . }: the set of natural integers.

ℤ_(n)≜{1, 2, . . . , n}: a finite subset of ℤ₊, for n∈ℤ₊.

ℕ_(n)≜{0, 1, 2, . . . , n}: a finite subset of ℕ, for n∈ℤ₊.

ℝ≜(−∞, ∞): the set of real numbers.

ℝ₊≜[0, ∞): the set of real positive numbers; (0, ∞)⊂ℝ₊: the set of strictly positive real numbers.

ℝ^(n): the vector space of n-tuples of real numbers.

ℝ^(n×m): the set of n by m matrices with elements in the real numbers, for n, m∈ℤ₊.

Q≥0: a symmetric positive semidefinite matrix Q∈ℝ^(n×n).

Q>0: a symmetric positive definite matrix Q∈ℝ^(n×n).

(Ω, 𝔽, ℙ): a probability space consisting of a set Ω, a σ-algebra 𝔽 of the subsets of Ω, and a probability measure ℙ: 𝔽→[0, 1].

X: Ω→ℝ: a real-valued random variable defined on some probability space (Ω, 𝔽, ℙ).

𝔽^(X): the σ-algebra (set of all possible events) generated by a random variable X: Ω→𝕏 with values in some arbitrary space 𝕏 (i.e., a finite set, ℝ^(n), etc.).

E[X]: the expectation of some random variable X defined on (Ω, 𝔽, ℙ).

Two random variables (RVs) X₁: Ω→𝕏₁ and X₂: Ω→𝕏₂ defined on (Ω, 𝔽, ℙ), with joint probability distribution P_(X₁,X₂)(x₁, x₂) and marginal probability distributions P_(X₁)(x₁) and P_(X₂)(x₂), are called independent if P_(X₁,X₂)(x₁, x₂)=P_(X₁)(x₁)P_(X₂)(x₂). An ℝ^(n)-valued Gaussian RV with parameters of the mean value m_(X)∈ℝ^(n) and the covariance matrix Q_(X)∈ℝ^(n×n), Q_(X)=Q_(X)^(T)≥0, is a function X: Ω→ℝ^(n) (which is a RV) such that the probability distribution of this RV equals a Gaussian distribution described by its characteristic function:

E[exp(iu^(T)X)]=exp(iu^(T)m_(X)−½u^(T)Q_(X)u), ∀u∈ℝ^(n).

The ℝ^(n)-valued Gaussian RV may be defined using the following notation:

-   X∈G(m_(X), Q_(X)): a Gaussian RV X with these parameters.
-   dim(X)=rank(Q_(X)): the effective dimension of the Gaussian RV X.
-   In accordance with the standard definition, any tuple of RVs X₁, . . . , X_(k) is called jointly Gaussian if the vector (X₁, X₂, . . . , X_(k))^(T) is a Gaussian RV.

A tuple of Gaussian random variables Y₁: Ω→ℝ^(p₁) and Y₂: Ω→ℝ^(p₂) may be denoted by (Y₁, Y₂). The covariance matrix of this tuple may be denoted by

$\begin{matrix}{{\left( {Y_{1},Y_{2}} \right)} \in {G\left( {0,Q_{({Y_{1},Y_{2}})}} \right)},} & (57) \\{{Q_{({Y_{1},Y_{2}})} = \begin{pmatrix}Q_{Y_{1}} & Q_{Y_{1},Y_{2}} \\Q_{Y_{1},Y_{2}}^{T} & Q_{Y_{2}}\end{pmatrix}} \in {{\mathbb{R}}^{{({p_{1} + p_{2}})} \times {({p_{1} + p_{2}})}}.}} & (58)\end{matrix}$

The covariance matrices Q_((Y₁,Y₂)) and Q_(Y₁,Y₂)∈ℝ^(p₁×p₂) are distinct from each other. Any such tuple of Gaussian RVs is independent if and only if Q_(Y₁,Y₂)=0.

The following definitions and properties may apply for purposes of discussion herein. A conditional expectation of a Gaussian RV X: Ω→ℝ^(n) conditioned on another Gaussian RV Y: Ω→ℝ^(p) with (X, Y)∈G(m, Q_((X,Y))) and with Q_(Y)>0 is again a Gaussian RV with characteristic function

$\begin{matrix}{{E\left\lbrack {\left. {\exp\left( {iu^{T}X} \right)} \middle| Y \right.} \right\rbrack} = {\exp\left( {{iu^{T}{E\left\lbrack {X|Y} \right\rbrack}} - {\frac{1}{2}{u^{T}{Q_{({X|Y})}u}}}} \right)},\;{\forall u \in {\mathbb{R}}^{n}},} & (59)\end{matrix}$

where

$\begin{matrix}{{Q_{({X,Y})} = \begin{pmatrix}Q_{X} & Q_{X,Y} \\Q_{X,Y}^{T} & Q_{Y}\end{pmatrix}},\;{{E\left\lbrack {X|Y} \right\rbrack} = {{Q_{X,Y}{Q_{Y}^{- 1}\left( {Y - m_{Y}} \right)}} + m_{X}},}} & (60) \\{{Q_{({X|Y})} = {Q_{X} - {Q_{X,Y}{Q_{Y}^{- 1}Q_{X,Y}^{T}}}}.}} & (61)\end{matrix}$

Other definitions used in the discussion herein include conditional independence of two RVs given another RV, and minimally Gaussian conditional independence, in accordance with the definitions and properties given below:

Consider three RVs, Y_(i): Ω→ℝ^(p_i) for i=1, 2 and X: Ω→ℝ^(n), with joint distribution P_(Y₁,Y₂,X)(y₁, y₂, x).

(a) Call the RVs Y₁ and Y₂ conditionally independent conditioned on, or given, X if

P_(Y₁,Y₂|X)(y₁, y₂|x)=P_(Y₁|X)(y₁|x)P_(Y₂|X)(y₂|x),  (62)

equivalently, Y₁↔X↔Y₂ forms a Markov chain.  (63)

The notation (Y₁, Y₂|X)∈CI may be used to denote this property.

(b) Call the RVs Y₁ and Y₂ Gaussian conditionally independent conditioned on X if

(1) (Y₁, Y₂|X)∈CI, and

(2) (Y₁, Y₂, X) are jointly Gaussian RVs.

The notation (Y₁, Y₂|X)∈CIG may be used to denote this property.

(c) Call the RVs (Y₁, Y₂|X) minimally Gaussian conditionally independent if

(1) they are Gaussian conditionally independent, and

(2) there does not exist another tuple (Y₁, Y₂, X₁) with X₁: Ω→ℝ^(n₁) such that (Y₁, Y₂|X₁)∈CIG and n₁<n.

This property may be denoted by (Y₁, Y₂|X)∈CIG_(min).

The following simple equivalent condition for conditional independence of Gaussian RVs may be used:

Consider a triple of jointly Gaussian RVs denoted as (Y₁, Y₂, X)∈G(0, Q) with Q_(X)>0. This triple is Gaussian conditionally independent if and only if

Q_(Y₁,Y₂)=Q_(Y₁,X)Q_(X)⁻¹Q_(X,Y₂).  (64)

It is minimally Gaussian conditionally independent if and only if n=dim(X)=rank(Q_(Y₁,Y₂)).
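The test of Eq. (64), together with the rank condition for minimality, may be sketched as follows; the tolerance is an illustrative choice.

```python
import numpy as np

def is_gaussian_ci(Q_Y1Y2, Q_Y1X, Q_X, Q_XY2, tol=1e-9):
    """Test the condition of Eq. (64): Q_{Y1,Y2} = Q_{Y1,X} Q_X^{-1} Q_{X,Y2}."""
    rhs = Q_Y1X @ np.linalg.solve(Q_X, Q_XY2)
    return np.allclose(Q_Y1Y2, rhs, atol=tol)

def is_minimal(Q_Y1Y2, n):
    """Minimality: n = dim(X) should equal rank(Q_{Y1,Y2})."""
    return n == np.linalg.matrix_rank(Q_Y1Y2)
```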

Another mathematical concept identified in the present disclosure is the weak stochastic realization problem of a Gaussian RV, which corresponds to the solution of the following problem:

Consider a Gaussian distribution G(0, Q₀) on the space ℝ^(p₁+p₂). Determine the integer n∈ℕ and construct all Gaussian distributions on the space ℝ^(p₁+p₂+n) such that, if G(0, Q₁) is such a distribution with (Y₁, Y₂, X)∈G(0, Q₁), then

(1) the (Y₁, Y₂)-marginal of G(0, Q₁) equals G(0, Q₀); and

(2) (Y₁, Y₂|X)∈CIG_(min).

Here (1) means the indicated RVs (Y₁, Y₂, X) are constructed having the distribution G(0, Q₁) with the dimensions p₁, p₂, n∈ℤ₊, respectively, with (Y₁, Y₂)-marginal distribution G(0, Q₀).

For the weak stochastic realization problem such a distribution may be constructed. To compute the Gray-Wyner lossy rate region ℛ_{GW}(Δ₁, Δ₂), its lower boundary, and Wyner's lossy common information C_W(Y₁, Y₂; Δ₁, Δ₂), it may be sufficient (and often necessary) to construct from the tuple of jointly Gaussian RVs (Y₁, Y₂) a joint distribution (Y₁, Y₂, W) ∈ G(0, Q_{(Y₁,Y₂,W)}), parametrized by a Gaussian random variable W, such that (Y₁, Y₂) are conditionally independent given W, as defined below:

Consider a tuple of Gaussian RVs specified by Y = (Y₁, Y₂) ∈ G(0, Q_{(Y₁,Y₂)}) with Y_i: Ω → ℝ^n for i = 1, 2. Assume there exists a Gaussian RV W: Ω → ℝ^n, W ∈ G(0, Q_W), such that (Y₁, Y₂, W) ∈ G(0, Q_{(Y₁,Y₂,W)}) with

$Q_{(Y_{1},Y_{2},W)} = \begin{pmatrix} Q_{Y_{1},Y_{1}} & Q_{Y_{1},Y_{2}} & Q_{Y_{1},W} \\ Q_{Y_{1},Y_{2}}^{T} & Q_{Y_{2},Y_{2}} & Q_{Y_{2},W} \\ Q_{Y_{1},W}^{T} & Q_{Y_{2},W}^{T} & Q_{W} \end{pmatrix}.$ Assume that $Q_{W} > 0$.

It is shown below how to construct such a random variable W by using the canonical variable form of a tuple of Gaussian RVs.

It is recognized herein that if a Gaussian random variable W: Ω → ℝ^n is constructed, as presented above, then Algorithm 1 can be used to parametrize the Gray-Wyner lossy rate region ℛ_{GW}(Δ₁, Δ₂), its lower boundary, and Wyner's lossy common information C_W(Y₁, Y₂; Δ₁, Δ₂). Accordingly, Algorithm 1 may parametrize all triples of Gaussian RVs of which the third makes the remaining two variables conditionally independent, as follows:

Algorithm 1: Parametrization of all Triples of Gaussian RVs of which the Third Makes the Remaining Two Variables Conditionally Independent.

Consider the above model of a tuple of Gaussian RVs. Assume that there exists a Gaussian RV W with W ∈ G(0, Q_W) and Q_W > 0, such that (Y₁, Y₂, W) ∈ CIG.

-   1. Compute first the variables,

Z₁ = Y₁ − E[Y₁|W] = Y₁ − Q_{Y₁,W} Q_W^{-1} W,  (65)

Z₂ = Y₂ − E[Y₂|W] = Y₂ − Q_{Y₂,W} Q_W^{-1} W,  (66)

    then the RVs of the triple (Z₁, Z₂, W) are independent.

-   2. Represent the tuple of Gaussian RVs Y = (Y₁, Y₂) ∈ G(0, Q_{(Y₁,Y₂)}) at the output of the pre-encoder, prior to the conversion into the messages (W₀, W₁, W₂), by

Y₁ = Q_{Y₁,W} Q_W^{-1} W + Z₁,  (67)

Y₂ = Q_{Y₂,W} Q_W^{-1} W + Z₂,  (68)

-   3. Define the reproduction of the tuple of Gaussian RVs Y = (Y₁, Y₂) ∈ G(0, Q_{(Y₁,Y₂)}) at the output of decoder 1 and decoder 2 by a tuple of Gaussian RVs Ŷ = (Ŷ₁, Ŷ₂) ∈ G(0, Q_{(Ŷ₁,Ŷ₂)}) given by

Ŷ₁ = Q_{Y₁,W} Q_W^{-1} W + Ẑ₁,  (69)

Ŷ₂ = Q_{Y₂,W} Q_W^{-1} W + Ẑ₂,  (70)

such that the RVs of the triple (Ẑ₁, Ẑ₂, W) are independent.  (71)

-   4. The square error distortions are invariant and satisfy

D_{Y₁}(y₁, ŷ₁) = |y₁ − ŷ₁|² = D_{Z₁}(z₁, ẑ₁) = |z₁ − ẑ₁|²,

D_{Y₂}(y₂, ŷ₂) = |y₂ − ŷ₂|² = D_{Z₂}(z₂, ẑ₂) = |z₂ − ẑ₂|².

-   5. The information rates satisfy the identities

C_W(Y₁, Y₂; Δ₁, Δ₂) = I(Y₁, Y₂; W),  (74)

R₁ ≥ R_{Y₁|W}(Δ₁) = R_{Z₁}(Δ₁),  (75)

R₂ ≥ R_{Y₂|W}(Δ₂) = R_{Z₂}(Δ₂),  (76)

R₀ + R₁ + R₂ ≥ R_{Y₁,Y₂}(Δ₁, Δ₂),  (77)

I(Y₁, Y₂; W) = H(Y₁, Y₂) − H(Z₁) − H(Z₂), where  (78)

H(Y₁, Y₂) is the differential entropy of (Y₁, Y₂),  (79)

H(Z₁) is the differential entropy of Z₁, and  (80)

H(Z₂) is the differential entropy of Z₂,  (81)

R_{Y₁,Y₂}(Δ₁, Δ₂) = I(Y₁, Y₂; W) + R_{Z₁}(Δ₁) + R_{Z₂}(Δ₂).  (82)

-   6. Encode the variables (W, Z₁, Z₂) into corresponding bit rates (R₀, R₁, R₂) with corresponding binary messages (W₀, W₁, W₂) of lengths (NR₀, NR₁, NR₂).
-   7. Communicate the three bit rates (R₀, R₁, R₂) with corresponding binary messages (W₀, W₁, W₂) via three noiseless channels from the encoder to the two decoders, such that the binary sequence W₀ is publicly available to the two decoders, the binary sequence W₁ is privately available to decoder 1, and the binary sequence W₂ is privately available to decoder 2.
-   8. At the decoder side, let decoder 1 convert the bit rate sequences (W₀, W₁) into the reproduction Ŷ₁ of Y₁, and let decoder 2 convert the bit rate sequences (W₀, W₂) into the reproduction Ŷ₂ of Y₂.
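To make steps 1 and 2 of Algorithm 1 concrete, a minimal numpy sketch follows; the function name and the array conventions are assumptions of the illustration, not part of the disclosure.

```python
import numpy as np

def split_private_parts(y1, y2, w, q_y1w, q_y2w, q_w):
    """Sketch of Algorithm 1, steps 1-2: subtract the common part
    E[Y_i | W] = Q_{Yi,W} Q_W^{-1} W, leaving the private parts Z_i."""
    q_w_inv = np.linalg.inv(q_w)     # Q_W > 0 is assumed, so invertible
    z1 = y1 - q_y1w @ q_w_inv @ w    # Eq. (65)
    z2 = y2 - q_y2w @ q_w_inv @ w    # Eq. (66)
    # Eqs. (67)-(68): (W, Z1, Z2) reproduces (Y1, Y2) exactly.
    assert np.allclose(y1, q_y1w @ q_w_inv @ w + z1)
    assert np.allclose(y2, q_y2w @ q_w_inv @ w + z2)
    return z1, z2
```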

A nonsingular basis transformation. As recognized herein, to identify the random variables (W, Ẑ₁, Ẑ₂) that are needed in Algorithm 1, the geometric approach for Gaussian random variables and the canonical variable form of a tuple of RVs (Y₁, Y₂) may be used. The transformation of the tuple of RVs (Y₁, Y₂) into the canonical variable form may be done by applying a basis transformation to such a tuple of RVs. Such a basis transformation is equivalent to a pre-encoder that maps sequences of data (Y₁^N, Y₂^N) into their canonical variable form.

The systematic procedure may be implemented using Algorithm 2, as described in more detail below.

A Gaussian RV Y: Ω → ℝ^p may be defined with respect to a particular basis of the linear space. The underlying geometric object of a Gaussian RV Y: Ω → ℝ^p is the σ-algebra ℱ^Y. A basis transformation of such a RV is then the transformation defined by a nonsingular matrix S ∈ ℝ^{p×p} such that ℱ^Y = ℱ^{SY} and the RV SY is Gaussian.

Next consider a tuple of jointly Gaussian RVs (Y₁, Y₂). A basis transformation of this tuple consists of a matrix S = Blockdiag(S₁, S₂), with S₁, S₂ square and nonsingular matrices, such that the σ-algebras satisfy ℱ^{Y₁} = ℱ^{S₁Y₁} and ℱ^{Y₂} = ℱ^{S₂Y₂}. This transformation introduces an equivalence relation on the representation of the tuple of RVs (Y₁, Y₂). Then one can analyze a canonical form for these spaces.

Below the following problem is considered and the solution provided:

Consider the tuple of jointly Gaussian RVs Y₁: Ω → ℝ^{p₁} and Y₂: Ω → ℝ^{p₂}, with (Y₁, Y₂) ∈ G(0, Q). Determine a canonical form for the spaces ℱ^{Y₁}, ℱ^{Y₂} up to linear basis transformations. This problem is further defined and described below.

Consider the tuple of jointly Gaussian RVs Y_i: Ω → ℝ^{p_i}, with Q_{Y_i} > 0 for i = 1, 2. These RVs are said to be in the canonical variable form if a basis has been chosen and a transformation of the RVs to this basis has been carried out such that, with respect to the new basis, one has the representation,

 ( Y 1 , Y 2 ) ∈ G  ( 0 , Q cvf ) ,   Q cvf = ( I p 11 0 0 I p 21 00 0 I p 12 0 0 D 0 0 0 I p 13 0 0 0 I p 21 0 0 I p 21 0 0 0 D 0 0 I p 220 0 0 0 0 0 I p 23 ) ∈ p × p ,   p , p 1 , p 2 , p 11 , p 12 , p 13 ,p 21 , p 22 , p 23 ∈ ,  p = p 1 + p 2 , p 1 = p 11 + p 12 + p 13 , p 2= p 21 + p 22 + p 23 , p 11 = p 21 , p 12 = p 22 , ( 84 )  D = Diag  (d 1 ,  …  , d p 12 ) , 1 > d 1 ≥ d 2 ≥ … ≥ d p 12 > 0 ,   Y = ( Y 1Y 2 ) = ( Y 11 Y 12 Y 13 Y 21 Y 22 Y 23 ) , Y ij : Ω → p ij , i = 1 , 2, j = 1 , 2 , 3. ( 85 )

Then

(Y₁₁, … , Y_{1k₁}), (Y₂₁, … , Y_{2k₂})

may be called the canonical variables and (d₁, . . . , d_{p₁₂}) the canonical correlation coefficients.

Given Y₁: Ω → ℝ^{p₁} and Y₂: Ω → ℝ^{p₂} that are jointly Gaussian RVs with (Y₁, Y₂) ∈ G(0, Q_{(Y₁,Y₂)}) and Q_{Y_i} > 0 for i = 1, 2, there exists a nonsingular basis transformation

S = Block-diag(S₁, S₂) ∈ ℝ^{(p₁+p₂)×(p₁+p₂)}  (86)

such that, with respect to the new basis, (S₁Y₁, S₂Y₂) ∈ G(0, Q_cvf) has the canonical variable form. Algorithm 2 may be used to transform the covariance matrix of a tuple (Y₁, Y₂) ∈ G(0, Q) into its canonical variable form, as discussed in more detail below.

Algorithm 2: Canonical Variable Form.

Transformation of a variance matrix to its canonical variable form.

Data: p₁, p₂ ∈ ℤ₊, Q ∈ ℝ^{(p₁+p₂)×(p₁+p₂)}, satisfying Q = Q^T > 0, with decomposition

$Q = \begin{pmatrix} Q_{11} & Q_{12} \\ Q_{12}^{T} & Q_{22} \end{pmatrix}, \quad Q_{11} \in \mathbb{R}^{p_{1} \times p_{1}}, \quad Q_{22} \in \mathbb{R}^{p_{2} \times p_{2}}, \quad Q_{12} \in \mathbb{R}^{p_{1} \times p_{2}}.$

1. Perform singular-value decompositions:

Q₁₁ = U₁D₁U₁^T, Q₂₂ = U₂D₂U₂^T,

with U₁ ∈ ℝ^{p₁×p₁} orthogonal (U₁U₁^T = I = U₁^T U₁) and

D₁ = Diag(d_{1,1}, . . . , d_{1,p₁}) ∈ ℝ^{p₁×p₁}, d_{1,1} ≥ d_{1,2} ≥ . . . ≥ d_{1,p₁} > 0,

and U₂, D₂ satisfying corresponding conditions.
2. Perform a singular-value decomposition of

D₁^{−1/2} U₁^T Q₁₂ U₂ D₂^{−1/2} = U₃D₃U₄^T,

with U₃ ∈ ℝ^{p₁×p₁}, U₄ ∈ ℝ^{p₂×p₂} orthogonal and

$D_{3} = \begin{pmatrix} I_{p_{11}} & 0 & 0 \\ 0 & D_{4} & 0 \\ 0 & 0 & 0 \end{pmatrix} \in \mathbb{R}^{p_{1} \times p_{2}}, \quad D_{4} = \mathrm{Diag}(d_{4,1}, \ldots, d_{4,p_{12}}) \in \mathbb{R}^{p_{12} \times p_{12}}, \quad 1 > d_{4,1} \geq d_{4,2} \geq \ldots \geq d_{4,p_{12}} > 0.$

3. Compute the new variance matrix according to,

$Q_{cvf} = {\begin{pmatrix}I_{p_{1}} & D_{3} \\D_{3}^{T} & I_{p_{2}}\end{pmatrix}.}$

4. The transformation to the canonical variable representation (Y₁ ↦ S₁Y₁, Y₂ ↦ S₂Y₂) is then

S₁ = U₃^T D₁^{−1/2} U₁^T, S₂ = U₄^T D₂^{−1/2} U₂^T.
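A minimal numpy sketch of Algorithm 2 follows. It is illustrative only: the function name and the use of `eigh`/`svd` are implementation choices under the stated assumptions (Q = Q^T > 0), and the trailing example reuses the 3×3 cross-covariance Diag(0.8, 0.5, 0.1) that also appears in the examples further below.

```python
import numpy as np

def canonical_variable_form(q, p1):
    """Sketch of Algorithm 2: return S1, S2 and the canonical correlations
    so that (S1 Y1, S2 Y2) has covariance in canonical variable form."""
    q11, q12, q22 = q[:p1, :p1], q[:p1, p1:], q[p1:, p1:]
    # Step 1: Q11 = U1 D1 U1^T and Q22 = U2 D2 U2^T.
    d1, u1 = np.linalg.eigh(q11)
    d2, u2 = np.linalg.eigh(q22)
    # Step 2: SVD of D1^{-1/2} U1^T Q12 U2 D2^{-1/2} = U3 D3 U4^T.
    m = np.diag(d1 ** -0.5) @ u1.T @ q12 @ u2 @ np.diag(d2 ** -0.5)
    u3, d3, u4t = np.linalg.svd(m)
    # Step 4: the basis transformation to the canonical variables.
    s1 = u3.T @ np.diag(d1 ** -0.5) @ u1.T
    s2 = u4t @ np.diag(d2 ** -0.5) @ u2.T
    return s1, s2, d3  # d3: ones (identical), d in (0,1) (correlated), zeros

# Example: unit marginals with cross-covariance Diag(0.8, 0.5, 0.1).
q = np.eye(6)
q[:3, 3:] = q[3:, :3] = np.diag([0.8, 0.5, 0.1])
print(np.round(canonical_variable_form(q, p1=3)[2], 6))  # [0.8 0.5 0.1]
```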

The patent application discloses the following decomposition of a tuple (Y₁, Y₂) ∈ G(0, Q_cvf) into identical, correlated, and independent components.

Consider a tuple (Y₁, Y₂) ∈ G(0, Q_cvf) of Gaussian RVs in the canonical variable form.

(a) The three components Y₁₁, Y₁₂, Y₁₃ of Y₁ are independent RVs. Similarly, the three components Y₂₁, Y₂₂, Y₂₃ of Y₂ are independent RVs.

(b) The equality Y₁₁=Y₂₁ of these RVs holds almost surely.

(c) The tuple of RVs (Y₁₂, Y₂₂) is correlated as shown by the formula

E[Y₁₂Y₂₂^T] = D = Diag(d₁, . . . , d_{p₁₂}).  (87)

    Note that the different components of Y₁₂ and of Y₂₂ are independent RVs: thus Y_{12,i} and Y_{12,j} are independent, Y_{22,i} and Y_{22,j} are independent, and Y_{12,i} and Y_{22,j} are independent, for all i ≠ j; and that Y_{12,j} and Y_{22,j} for j = 1, . . . , p₁₂ = p₂₂ are correlated.

(d) The RV Y₁₃ is independent of Y₂. Similarly, the RV Y₂₃ is independent of Y₁.

Next, the interpretation of the various components of the canonical variable form is defined.

Consider a tuple of jointly-Gaussian RVs (Y₁, Y₂) ∈ G(0, Q_cvf) in the canonical variable form defined above. Call the various components as defined in the next table.

Y₁₁ = Y₂₁: identical component of Y₁ with respect to Y₂
Y₁₂: correlated component of Y₁ with respect to Y₂
Y₁₃: independent component of Y₁ with respect to Y₂
Y₂₁ = Y₁₁: identical component of Y₂ with respect to Y₁
Y₂₂: correlated component of Y₂ with respect to Y₁
Y₂₃: independent component of Y₂ with respect to Y₁

Next, it is illustrated to a person skilled in the art that a tuple of Gaussian RVs can be brought into a convenient form for implementation by applying the rules stated above to construct a systematic encoder. It is also illustrated that all elements of the Gray-Wyner lossy rate region ℛ_{GW}(Δ₁, Δ₂), its lower boundary, and Wyner's lossy common information C_W(Y₁, Y₂; Δ₁, Δ₂) can be computed with the aid of Algorithm 1 and Algorithm 2, by computing the entropy, conditional entropy, mutual information, rate distortion function, joint rate distortion function, and conditional rate distortion function, subject to an MSE fidelity, of Gaussian vector RVs.

A method of calculating the Gray-Wyner lossy rate region ℛ_{GW}(Δ₁, Δ₂), its lower boundary, and Wyner's lossy common information C_W(Y₁, Y₂; Δ₁, Δ₂) is described herein. The discussion uses the concepts of a Gaussian RV, of the canonical variable form of a tuple of jointly-Gaussian random variables, of identical, correlated, and private information, and of conditionally-independent Gaussian RVs, as introduced above.

The discussion herein discloses an algorithm, called Algorithm 3, broken into Algorithm 3(a) and Algorithm 3(b); Algorithm 3(a) may be used to compute Wyner's lossy common information C_W(Y₁, Y₂; Δ₁, Δ₂).

Algorithm 3(a): Wyner's lossy common information C_(W)(Y₁, Y₂; Δ₁, Δ₂).

Consider a tuple of finite-dimensional Gaussian RVs (Y₁, Y₂) ∈ G(0, Q_{(Y₁,Y₂)}) as defined in (57), (58).

1. Compute the canonical variable form of the tuple of Gaussian RVs according to Algorithm 2. This yields the indices p₁₁ = p₂₁, p₁₂ = p₂₂, p₁₃, p₂₃, and n = p₁₁ + p₁₂ = p₂₁ + p₂₂, and the diagonal matrix D with canonical singular values d_i ∈ (0,1) for i = 1, . . . , n.
2. Compute and output Wyner's lossy common information according to the formula,

$\begin{matrix}{{C_{W}(Y_{1},Y_{2};\Delta_{1},\Delta_{2})} = \begin{cases} 0, & \text{if}\; 0 = p_{11} = p_{21},\; 0 = p_{12} = p_{22},\; p_{13} > 0,\; p_{23} > 0, \\ \frac{1}{2}\sum_{i=1}^{n} \ln\left( \frac{1+d_{i}}{1-d_{i}} \right), & \text{if}\; 0 = p_{11} = p_{21},\; p_{12} = p_{22} > 0,\; p_{13} \geq 0,\; p_{23} \geq 0, \\ \infty, & \text{if}\; p_{11} = p_{21} > 0,\; p_{12} = p_{22} \geq 0,\; p_{13} \geq 0,\; p_{23} \geq 0, \end{cases}} & (88)\end{matrix}$

for (Δ₁, Δ₂) that lie in a region specified by the validity of Eq. 56.
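Once Algorithm 2 has produced the indices and the canonical correlations, the case split of (88) can be evaluated mechanically. The following sketch returns nats by default; the function name and the unit convention are assumptions of the illustration.

```python
import numpy as np

def wyner_lossy_common_information(d, p11, bits=False):
    """Sketch of Eq. (88): `d` holds the canonical correlations in (0,1)
    of the correlated part; `p11` counts identical components."""
    d = np.asarray(d, dtype=float)
    if p11 > 0:        # identical components present: C_W is infinite
        return np.inf
    if d.size == 0:    # only private components: C_W is zero
        return 0.0
    cw = 0.5 * np.sum(np.log((1 + d) / (1 - d)))
    return cw / np.log(2) if bits else cw
```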

Algorithm 3(a) produces Wyner's lossy common information of the tuple. This may also be derived from the calculation of the Gray-Wyner lossy rate region presented below.

The computation of the common information may be structured by the concepts of identical, correlated, and private components, or information, of the two vectors considered. Wyner's lossy common information:

-   (i) in the first case of equation (88) covers the case in which the RVs (Y₁, Y₂) are independent RVs and there are neither identical nor correlated components,
-   (ii) in the second case of equation (88) covers the case in which there is no identical component, but there are nontrivial correlated components, and there may be independent components, and
-   (iii) in the last case of equation (88) covers the case when there is a nontrivial identical component and, possibly, correlated and independent components.

The discussion herein discloses Algorithm 3(b), which is used to compute the Gray-Wyner lossy rate region ℛ_{GW}(Δ₁, Δ₂), and includes all necessary elements for the implementation of compression or quantization techniques. Some of these elements are summarized below.

Algorithm 3(b): Gray-Wyner Lossy Rate Region ℛ_{GW}(Δ₁, Δ₂).

Consider a tuple of finite-dimensional Gaussian RVs (Y₁, Y₂) ∈ G(0, Q_{(Y₁,Y₂)}) as defined in (57), (58).

1. Compute the representation of the tuple of RVs (Y₁, Y₂) in the canonical variable form, as defined above, by

( Y 1 , Y 2 ) ∈ G  ( 0 , Q ( Y 1 , Y 2 ) ) ,   Y 1 , Y 2 : Ω → ″ , n∈ + , Q y 1 , y 2 = ( I D D I ) ( 89 )

at the output of the pre-encoder into three independent Gaussian RVs (X, Z₁, Z₂), where the individual components of each of these vectors are independent RVs, and

Z₁ ∈ G(0, (I−D)), Z₂ ∈ G(0, (I−D)), X ∈ G(0, I).  (90)

2. Compute the rate triple (R₀, R₁, R₂) that lies in the Pangloss plane of the Gray-Wyner lossy rate region ℛ_{GW}(Δ₁, Δ₂) from two rate distortion functions and Wyner's common information, with the first rate distortion function corresponding to (i) the n-dimensional vector of independent Gaussian random variables Z₁ ∈ G(0, (I−D)) subject to square error distortion D_{Z₁}(z₁, ẑ₁) = |z₁ − ẑ₁|², given by

$\begin{matrix}{{R_{1} \geq R_{Z_{1}}(\Delta_{1})} = {\inf_{P_{{\hat{Z}}_{1}|Z_{1}}: E\{|Z_{1} - {\hat{Z}}_{1}|^{2}\} \leq \Delta_{1}} I(Z_{1}; {\hat{Z}}_{1})} = {\sum_{j=1}^{n} R_{Z_{1,j}}(\Delta_{1,j})}} & (91)\end{matrix}$

and the second rate distortion function corresponding to (ii) the n-dimensional vector of independent Gaussian RVs Z₂ ∈ G(0, (I−D)) subject to square error distortion D_{Z₂}(z₂, ẑ₂) = |z₂ − ẑ₂|², given by

$\begin{matrix}{{R_{2} \geq R_{Z_{2}}(\Delta_{2})} = {\inf_{P_{{\hat{Z}}_{2}|Z_{2}}: E\{|Z_{2} - {\hat{Z}}_{2}|^{2}\} \leq \Delta_{2}} I(Z_{2}; {\hat{Z}}_{2})} = {\sum_{j=1}^{n} R_{Z_{2,j}}(\Delta_{2,j})}} & (92)\end{matrix}$

and Wyner's common information corresponding to (iii) the tuple (Y₁, Y₂) of Gaussian RVs in the canonical variable form, (Y₁, Y₂) ∈ G(0, Q_{(Y₁,Y₂)}), (Y₁, Y₂): Ω → ℝ^n, n ∈ ℤ₊, $Q_{(Y_{1},Y_{2})} = \begin{pmatrix} I & D \\ D & I \end{pmatrix}$, as defined above, given by

R₀ = I(Y₁, Y₂; X) = H(Y₁, Y₂) − H(Z₁) − H(Z₂).  (93)

3. The Pangloss plane may be defined by

$\begin{matrix}{{{R_{0} + R_{1} + R_{2}} = {R_{Y_{1},Y_{2}}\left( {\Delta_{1},\Delta_{2}} \right)}},} & (94) \\{\mspace{140mu} {= {{I\left( {Y_{1},{Y_{2};X}} \right)} + {R_{Z_{1}}\left( \Delta_{1} \right)} + {{R_{Z_{2}}\left( \Delta_{2} \right)}.}}}} & (95)\end{matrix}$

for (Δ₁, Δ₂) that lie in a region specified by the validity of Eq. 56.
4. Compute the rate triple (R₀, R₁, R₂) that lies in the Gray-Wyner lossy rate region ℛ_{GW}(Δ₁, Δ₂) from the two rate distortion functions, (91) and (92), and the additional sum rate

R₀ + R₁ + R₂ ≥ R_{Y₁,Y₂}(Δ₁, Δ₂),  (96)

R_{Y₁,Y₂}(Δ₁, Δ₂) = I(Y₁, Y₂; X) + R_{Z₁}(Δ₁) + R_{Z₂}(Δ₂).  (97)

5. Compute the representation of the tuple of RVs (Ŷ₁, Ŷ₂) at the output of pre-decoder 1 and pre-decoder 2 into three independent random variables (X, Ẑ₁, Ẑ₂), where the individual components of each of these vectors are independent RVs (the details are shown in Algorithm 4).

The proof that Algorithm 3(b) is correct, and produces Wyner's lossy common information and a rate triple (R₀, R₁, R₂) that lies in the Gray-Wyner lossy rate region ℛ_{GW}(Δ₁, Δ₂) and on its lower boundary, follows directly from the material herein.

As described above, Algorithm 4 may be used to represent Ŷ₁^N at the output of pre-decoder 1 and Ŷ₂^N at the output of pre-decoder 2, each consisting of two independent parts: the signal D^{1/2}X that is common to both, and private parts that may be realized by parallel additive Gaussian noise channels. Algorithm 4 also includes the realization of (Y₁^N, Y₂^N) at the output of the pre-encoder.

Algorithm 4: Realizations of Processes that Operate at the Gray-Wyner Lossy Rate Region ℛ_{GW}(Δ₁, Δ₂), its Lower Boundary, and Wyner's Lossy Common Information C_W(Y₁, Y₂; Δ₁, Δ₂).

Consider two sequences of multivariate jointly Gaussian discrete-time processes (Y₁^N, Y₂^N) with square error distortion as defined by (1)-(6).

Apply Algorithm 2 to obtain a tuple of RVs (Y₁, Y₂) in the canonical variable form as defined above. Restrict attention to the correlated parts of these RVs. Thus, the RVs (Y₁, Y₂) have the same dimension n = p₁ = p₂, and their cross-covariance matrix D ∈ ℝ^{n×n} is a nonsingular diagonal matrix with ordered real numbers in the interval (0,1) on the diagonal, specified by

$\begin{matrix}{(Y_{1},Y_{2}) \in G(0, Q_{(Y_{1},Y_{2})}) = P_{0}, \quad Y_{1}, Y_{2}: \Omega \rightarrow \mathbb{R}^{n}, \; n \in \mathbb{Z}_{+},} & (98) \\ {Q_{(Y_{1},Y_{2})} = \begin{pmatrix} I & D \\ D & I \end{pmatrix}, \quad D = \mathrm{Diag}(d_{1}, d_{2}, \ldots, d_{n}) \in \mathbb{R}^{n \times n}, \quad 1 > d_{1} \geq d_{2} \geq \ldots \geq d_{n} > 0.} & (99)\end{matrix}$

1. The representations of the n-dimensional jointly independent sequence of RVs (Y₁^N, Y₂^N) = {(Y_{1,i}, Y_{2,i})}_{i=1}^N at the output of the pre-encoder into three independent sequences of RVs (Z₁^N, Z₂^N, X^N), each of which is a jointly independent sequence, to generate the corresponding binary messages (W₁, W₂, W₀), are given by

V: Ω → ℝ^n, V ∈ G(0, I); ℱ^V and ℱ^{Y₁} ∨ ℱ^{Y₂} are independent σ-algebras,

L₁ = L₂ = D^{1/2}(I+D)^{−1} ∈ ℝ^{n×n},  (100)

L₃ = (I−D)^{1/2}(I+D)^{−1/2} ∈ ℝ^{n×n}; L₁, L₂, L₃ are diagonal matrices,  (101)

X = L₁Y₁ + L₂Y₂ + L₃V, X: Ω → ℝ^n,  (102)

Z₁ = Y₁ − D^{1/2}X, Z₁: Ω → ℝ^n,  (103)

Z₂ = Y₂ − D^{1/2}X, Z₂: Ω → ℝ^n.  (104)

Z₁ ∈ G(0, (I−D)), Z₂ ∈ G(0, (I−D)), X ∈ G(0, I);  (105)

(Z₁, Z₂, X) are independent.  (106)

The representation of canonical vectors is:

Y₁ = D^{1/2}X + Z₁, Y₂ = D^{1/2}X + Z₂.  (107)
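The representation (100)-(107) can be checked numerically. In the following sketch the dimensions, the canonical correlations, and the seed are arbitrary assumptions; the sketch samples the canonical pair, recovers the common RV X from (Y₁, Y₂) and the auxiliary noise V, and verifies that X has approximately unit covariance, as in (105).

```python
import numpy as np

rng = np.random.default_rng(0)
n, N = 3, 100_000
d = np.array([0.8, 0.5, 0.1])          # canonical correlations

# Sample the canonical pair Y1 = D^{1/2} X + Z1, Y2 = D^{1/2} X + Z2.
x = rng.standard_normal((N, n))
z1 = rng.standard_normal((N, n)) * np.sqrt(1 - d)
z2 = rng.standard_normal((N, n)) * np.sqrt(1 - d)
y1 = np.sqrt(d) * x + z1
y2 = np.sqrt(d) * x + z2

# Recover the common RV X from (Y1, Y2) and fresh noise V, Eqs. (100)-(102).
l12 = np.sqrt(d) / (1 + d)             # L1 = L2 = D^{1/2}(I+D)^{-1}
l3 = np.sqrt((1 - d) / (1 + d))        # L3 = (I-D)^{1/2}(I+D)^{-1/2}
v = rng.standard_normal((N, n))
x_enc = l12 * y1 + l12 * y2 + l3 * v
print(np.round(np.cov(x_enc.T), 2))    # approximately the identity, Eq. (105)
```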

Note that, in addition, each of the RVs Z₁, Z₂, X has independent components.
2. The rate triple (R₀, R₁, R₂) that lies in the Pangloss plane of the Gray-Wyner lossy rate region ℛ_{GW}(Δ₁, Δ₂) may be computed from the two rate distortion functions and Wyner's lossy common information given by

$\begin{matrix}{\mspace{79mu} {{R_{1} \geq {R_{Z_{1}}\left( \Delta_{1} \right)}} = {\frac{1}{2}{\sum\limits_{j = 1}^{n}\; {\log\left( {{\frac{\left( {1 - d_{j}} \right)}{\Delta_{1,j}} = {\sum\limits_{j = 1}^{n}\; {R_{Z_{1,j}}\left( \Delta_{1,j} \right)}}},} \right.}}}}} & (108) \\{\mspace{79mu} {{{where}\mspace{14mu} \Delta_{1,j}} = \left\{ {\begin{matrix}{\lambda,} & {\lambda \leq {1 - d_{j}}} \\{{1 - d_{j}},} & {\lambda \geq {1 - d_{j}}}\end{matrix},\mspace{79mu} {{{and}\mspace{14mu} {\sum\limits_{j = 1}^{n}\Delta_{1,j}}} = \Delta_{1}},{\Delta_{1} \in \left( {0,\infty} \right)}} \right.}} & (109) \\{\mspace{79mu} {{R_{2} \geq {R_{Z_{2}}\left( \Delta_{2} \right)}} = {\frac{1}{2}{\sum\limits_{j = 1}^{n}\; {\log\left( {{\frac{\left( {1 - d_{j}} \right)}{\Delta_{2,j}} = {\sum\limits_{j = 1}^{n}\; {R_{Z_{2,j}}\left( \Delta_{2,j} \right)}}},} \right.}}}}} & (110) \\{\mspace{79mu} {{{where}\mspace{14mu} \Delta_{2,j}} = \left\{ {\begin{matrix}{\lambda,} & {\lambda \leq {1 - d_{j}}} \\{{1 - d_{j}},} & {\lambda \geq {1 - d_{j}}}\end{matrix},\mspace{79mu} {{{and}\mspace{14mu} {\sum\limits_{j = 1}^{n}\Delta_{2,j}}} = \Delta_{2}},{\Delta_{2} \in \left( {0,\infty} \right)}} \right.}} & (111) \\{R_{0} = {{I\left( {Y_{1},{Y_{2};X}} \right)} = {{{H\left( {Y_{1},Y_{2}} \right)} - {H\left( Z_{1} \right)} - {H\left( Z_{2} \right)}} = {\frac{1}{2}{\sum\limits_{i = 1}^{n}\; {\log \left( \frac{1 + d_{i}}{1 - d_{i}} \right)}}}}}} & (112) \\{{R_{0} + R_{1} + R_{2}} = {{R_{Y_{1},Y_{2}}\left( {\Delta_{1},\Delta_{2}} \right)} = {{I\left( {Y_{1},{Y_{2};X}} \right)} + {R_{Z_{1}}\left( \Delta_{2} \right)} + {{R_{Z_{2}}\left( \Delta_{2} \right)}.}}}} & (113)\end{matrix}$

R_{Z₁}(Δ₁) is the rate distortion function of the Gaussian vector Z₁ with independent components, subject to the square error distortion function D_{Y₁}(y₁, ŷ₁) = |y₁ − ŷ₁|² = D_{Z₁}(z₁, ẑ₁) = |z₁ − ẑ₁|²; R_{Z₂}(Δ₂) is the rate distortion function of the Gaussian vector Z₂ with independent components, subject to the square error distortion function D_{Y₂}(y₂, ŷ₂) = |y₂ − ŷ₂|² = D_{Z₂}(z₂, ẑ₂) = |z₂ − ẑ₂|².

3. The computation of the rate triple (R₀, R₁, R₂) that lies in the Gray-Wyner lossy rate region ℛ_{GW}(Δ₁, Δ₂) corresponds to

R₀ + R₁ + R₂ ≥ R_{Y₁,Y₂}(Δ₁, Δ₂) = I(Y₁, Y₂; X) + R_{Z₁}(Δ₁) + R_{Z₂}(Δ₂),  (116)

such that Eq. 108 and Eq. 110 are satisfied.
4. The reproductions (Ŷ₁^N, Ŷ₂^N) = {(Ŷ_{1,i}, Ŷ_{2,i})}_{i=1}^N of the n-dimensional RVs (Y₁^N, Y₂^N), at the output of pre-decoder 1 and pre-decoder 2, that ensure the MSEs are satisfied, are jointly independent, and are given by

Ŷ₁ = D^{1/2}X + Ẑ₁, Ŷ₂ = D^{1/2}X + Ẑ₂,

Ẑ₁ = A₁Z₁ + V₁, Ẑ₂ = A₂Z₂ + V₂,

A₁ = I − Λ₁(I−D)^{−1}, A₂ = I − Λ₂(I−D)^{−1},

Λ₁ = Diag(Δ_{1,1}, Δ_{1,2}, . . . , Δ_{1,n}) ∈ ℝ^{n×n}, Λ₂ = Diag(Δ_{2,1}, . . . , Δ_{2,n}) ∈ ℝ^{n×n},  (117)

0 ≤ Δ_{1,1} ≤ 1−d₁ ≤ Δ_{1,2} ≤ 1−d₂ ≤ . . . ≤ Δ_{1,n} ≤ 1−d_n < 1,  (118)

0 ≤ Δ_{2,1} ≤ 1−d₁ ≤ Δ_{2,2} ≤ 1−d₂ ≤ . . . ≤ Δ_{2,n} ≤ 1−d_n < 1,  (119)

V₁ ∈ G(0, A₁Λ₁), V₂ ∈ G(0, A₂Λ₂),  (120)

(V₁, V₂, X) are independent.  (121)

Note that the components of V₁ are independent and the components of V₂ are independent, and hence the components of Ẑ₁ are independent and the components of Ẑ₂ are independent.
5. The rate triple (R₀, R₁, R₂) satisfies the equations

R₀ = I(Y₁, Y₂; X) = H(Y₁, Y₂) − H(Y₁, Y₂|X) = H(Y₁, Y₂) − H(Z₁) − H(Z₂),  (122)

R₁ ≥ R_{Y₁|X}(Δ₁) = R_{Z₁}(Δ₁),  (123)

R₂ ≥ R_{Y₂|X}(Δ₂) = R_{Z₂}(Δ₂),  (124)

C_X(Y₁, Y₂; Δ₁, Δ₂) = I(Y₁, Y₂; X), where  (125)

H(Y₁, Y₂) is the differential entropy of (Y₁, Y₂) at the output of the pre-encoder defined by (107),  (126)

H(Z₁) is the differential entropy of Z₁, and  (127)

H(Z₂) is the differential entropy of Z₂.  (128)

The conditional rate distortion function satisfies

R_{Y₁,Y₂|X}(Δ₁, Δ₂) = R_{Y₁|X}(Δ₁) + R_{Y₂|X}(Δ₂) = R_{Z₁}(Δ₁) + R_{Z₂}(Δ₂).  (129)

The joint rate distortion function satisfies

$\begin{matrix}{{R_{Y_{1},Y_{2}}(\Delta_{1},\Delta_{2})} = {R_{Y_{1}|X}(\Delta_{1}) + R_{Y_{2}|X}(\Delta_{2}) + I(Y_{1},Y_{2};X)}} & (130) \\ {= {R_{Z_{1}}(\Delta_{1}) + R_{Z_{2}}(\Delta_{2}) + I(Y_{1},Y_{2};X).}} & (131)\end{matrix}$
The sum rate satisfies

R₀ + R₁ + R₂ = R_{Y₁,Y₂}(Δ₁, Δ₂) = I(Y₁, Y₂; X) + R_{Z₁}(Δ₁) + R_{Z₂}(Δ₂).  (132)

6. Wyner's lossy common information is attained for the RV X defined in item 1,

C_W(Y₁, Y₂; Δ₁, Δ₂) = C_X(Y₁, Y₂; Δ₁, Δ₂) = I(Y₁, Y₂; X),  (133)

and Wyner's lossy common information of these two random variables equals,

$\begin{matrix}{\mspace{79mu} {{C_{W}\left( {Y_{1},{Y_{2};\Delta_{1}},\Delta_{2}} \right)} = {C_{X}\left( {Y_{1},{Y_{2};\Delta_{1}},\Delta_{2}} \right)}}} & (134) \\{{= {\frac{1}{2}{\sum\limits_{i = 1}^{n}\; {\ln \left( \frac{1 + d_{i}}{1 - d_{i}} \right)}}}},{{{for}\mspace{14mu} 0} \leq \Delta_{1} \leq {\sum\limits_{j = 1}^{n}\left( {1 - d_{j}} \right)}},{0 \leq \Delta_{2} \leq {\sum\limits_{j = 1}^{n}{\left( {1 - d_{j}} \right).}}}} & (135)\end{matrix}$

Wyner's lossy common information is specified by Eqs. 133-135 for (Δ₁, Δ₂) that lie in a region specified by the validity of Eq. 56.
7. The realizations or representations given in item 1 of the pre-encoder signals (Y₁, Y₂), and in item 4 of the representation Ŷ₁ at the output of pre-decoder 1 and the representation Ŷ₂ at the output of pre-decoder 2, consist of two independent parts: the signal D^{1/2}X that is common to both, and private channels that are realized by parallel additive Gaussian noise channels, given by

Ŷ₁^P = Ẑ₁ = A₁Z₁ + V₁, Ŷ₂^P = Ẑ₂ = A₂Z₂ + V₂.  (136)

Current compression or quantization techniques are readily applicable; i.e., the pre-decoder reproduction sequences (Ŷ₁^N, Ŷ₂^N) = {(Ŷ_{1,i}, Ŷ_{2,i})}_{i=1}^N are such that the (Ŷ_{1,i}, Ŷ_{2,i}) are jointly independent and identically distributed, to produce the messages (W₀, W₁, W₂).

That is, in Algorithm 4, the representation of Ŷ₁^N at the output of pre-decoder 1 and the representation of Ŷ₂^N at the output of pre-decoder 2 consist of two independent parts: the signal D^{1/2}X that is common to both, and private parts that are realized by parallel additive Gaussian noise channels. It is then clear to those familiar with the art of implementing compression or quantization techniques, such as lattice codes, joint entropy coded dithered quantization (ECDQ), etc., that the messages (W₁, W₂, W₀) can be produced.
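By way of a purely illustrative stand-in for such techniques (the disclosure does not prescribe a particular quantizer, and the step size below is an arbitrary assumption), a scalar dithered quantizer of the kind used in ECDQ may be sketched as follows; the integer indices would then be entropy-coded into the messages.

```python
import numpy as np

def dithered_quantize(x, step, rng):
    """Toy scalar dithered quantizer (illustrative stand-in for ECDQ).
    The dither U ~ Uniform(-step/2, step/2) is shared by encoder and
    decoder; the quantization error is bounded by step/2."""
    u = rng.uniform(-step / 2, step / 2, size=x.shape)  # shared dither
    idx = np.round((x + u) / step).astype(int)          # symbols to encode
    x_hat = idx * step - u                              # reconstruction
    return idx, x_hat

rng = np.random.default_rng(1)
z1 = rng.standard_normal(8) * np.sqrt(0.2)              # a private part Z1
idx, z1_hat = dithered_quantize(z1, step=0.25, rng=rng)
print(np.max(np.abs(z1 - z1_hat)) <= 0.125)             # True
```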

The present application discloses that Algorithms 1-4 may also be generalized to time-invariant Gaussian processes, in which the two output processes (Y₁^N, Y₂^N) are generated by the representation described above.

The starting point of the analysis is the probability distribution of a tuple of stationary jointly-Gaussian stochastic processes. To be able to compute with these processes, it may be assumed that the tuple is the output of a weak Gaussian stochastic realization of these processes in the form of a time-invariant Gaussian system of which the state influences both components of the output process.

Define a Gaussian white noise process as a sequence of independent Gaussian RVs with distribution V(t) ∈ G(0, Q_V) for all t ∈ T, with Q_V ∈ ℝ^{m_v×m_v} and Q_V = Q_V^T ≥ 0.

Define a time-invariant Gaussian (stochastic) system representation with two output processes as the following stochastic dynamic system with representation (7) and (8). Also needed is the backward representation of this Gaussian stochastic system, of the form,

X(t−1) = A_b X(t) + MV_b(t), X(t₁) = X₁,  (137)

Y(t−1) = C_b X(t) + NV_b(t),  (138)

V_b: Ω×T → ℝ^{m_v}, a Gaussian white noise process,

V_b(t) ∈ G(0, Q_V), Q_V ∈ ℝ^{m_v×m_v}, Q_V = Q_V^T ≥ 0,

ℱ^{X₀}, ℱ_{t₁}^{V_b} are independent σ-algebras,

A_b ∈ ℝ^{n×n}, M ∈ ℝ^{n×m_v}, C_b ∈ ℝ^{p×n}, N ∈ ℝ^{p×m_v}.

Presented below are the relations between the system matrices of the forward and the backward system representations.

Below, concepts of the stochastic realization theory of stationary Gaussian processes are used. Weak Gaussian stochastic realization theory is not to be identified with strong Gaussian stochastic realization theory. The strong realization problem is not needed for the problem herein.

It is a result of the weak stochastic realization theory of Gaussian processes that a tuple of stationary jointly-Gaussian stochastic processes admits a realization as the outputs of a time-invariant Gaussian system, as defined above, if and only if the covariance function of the output process satisfies a rank condition on the Hankel matrix of the covariance function. In addition, the Gaussian system may be called a minimal weak Gaussian realization of the output processes if and only if particular conditions hold, which are defined below.

Consider the time-invariant Gaussian system,

X(t+1) = AX(t) + MV(t), X(t₀) = X₀.  (139)

Call the matrix tuple (A, MQ_V^{1/2}) ∈ ℝ^{n×n} × ℝ^{n×m_v} a supportable pair if

rank(MQ_V^{1/2}, AMQ_V^{1/2}, . . . , A^{n−1}MQ_V^{1/2}) = n,  (140)

and call the Gaussian system supportable if (A, MQ_V^{1/2}) is a supportable pair.

Call the matrix tuple (A, C) ∈ ℝ^{n×n} × ℝ^{p×n} an observable pair if

$\begin{matrix}{{{{rank}\mspace{14mu} \begin{pmatrix}C \\{CA} \\\vdots \\{CA}^{n - 1}\end{pmatrix}} = n},} & (141)\end{matrix}$

and call the Gaussian system stochastically observable if (A, C) is an observable pair.

Call the matrix tuple (A_b, C_b) ∈ ℝ^{n×n} × ℝ^{p×n} a co-observable pair if

$\begin{matrix}{{{{rank}\mspace{14mu} \begin{pmatrix}C_{b} \\{C_{b}A_{b}} \\\vdots \\{C_{b}A_{b}^{n - 1}}\end{pmatrix}} = n},} & (142)\end{matrix}$

and call the Gaussian system stochastically co-observable if (A_b, C_b) is a co-observable pair. Note that the matrix tuple (A_b, C_b) relates to the backward representation of the Gaussian system; see equations (137), (138).

The equation (140) may be called a controllability condition of an associated linear system. For stochastic systems, the expression supportable pair is used rather than controllable pair because the first term refers to the support of the state process rather than to a controllability condition. In a Gaussian control system representation both the matrix B and the matrix M appear, and then both the concept of a supportable pair and that of a controllable pair are needed.
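The rank tests (140)-(142) can be checked directly; a small numpy sketch follows (the function names are illustrative, and co-observability is the same test applied to (A_b, C_b)).

```python
import numpy as np

def is_supportable(a, m_qv_half):
    """Eq. (140): rank of [MQ_V^{1/2}, A MQ_V^{1/2}, ..., A^{n-1} MQ_V^{1/2}]."""
    n = a.shape[0]
    blocks, b = [], m_qv_half
    for _ in range(n):
        blocks.append(b)
        b = a @ b
    return np.linalg.matrix_rank(np.hstack(blocks)) == n

def is_observable(a, c):
    """Eqs. (141)-(142): rank of [C; CA; ...; CA^{n-1}] equals n."""
    n = a.shape[0]
    rows, r = [], c
    for _ in range(n):
        rows.append(r)
        r = r @ a
    return np.linalg.matrix_rank(np.vstack(rows)) == n
```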

The weak stochastic realization of the output process is called a minimal realization if the dimension n of the state space of the above system is minimal over all realizations. The Gaussian system defined above may be called a minimal weak realization of its output process if and only if the following conditions hold:

(1) (A, MQ_V^{1/2}) is a supportable pair;
(2) the system is stochastically observable, which is equivalent to (A, C) being an observable pair; and
(3) the system is stochastically co-observable, which is equivalent to (A_b, C_b) being a co-observable pair, where (A_b, C_b) are the system matrices of the corresponding backward system representation.
Condition (1) is equivalent to the condition that the support of the state process equals the state space X = ℝ^n, which again is equivalent to X(t) ∈ G(0, Q_X) with Q_X > 0. A consequence of the stationarity of the output process and of the minimality of the weak stochastic realization is that the system matrix A has all its eigenvalues strictly inside the unit disc, denoted spec(A) ⊂ 𝔻_o. Needed is the probability measure of the tuple of output processes of the Gaussian system defined above. Because of the conditions mentioned above, in particular spec(A) ⊂ 𝔻_o, there exists an invariant measure of the state process, denoted by X(t) ∈ G(0, Q_X), where the matrix Q_X = Q_X^T ≥ 0 is the unique solution of the discrete Lyapunov equation,

Q_X = AQ_X A^T + MQ_V M^T.  (143)

Because, in addition, (A, MQ_V^{1/2}) is a supportable pair, Q_X > 0. The probability distribution of the output is then Y(t) ∈ G(0, Q_Y) with

Q_Y = CQ_X C^T + NQ_V N^T.  (144)
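Equations (143)-(144) can be evaluated with a standard discrete Lyapunov solver. The following sketch uses scipy; the system matrices are arbitrary stable values assumed for the illustration.

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

a = np.array([[0.8, 0.1], [0.0, 0.7]])   # spec(A) inside the unit disc
m = np.eye(2)
n_mat = np.eye(2)
c = np.eye(2)
q_v = 0.5 * np.eye(2)

# Q_X = A Q_X A^T + M Q_V M^T, Eq. (143).
q_x = solve_discrete_lyapunov(a, m @ q_v @ m.T)
# Q_Y = C Q_X C^T + N Q_V N^T, Eq. (144).
q_y = c @ q_x @ c.T + n_mat @ q_v @ n_mat.T
```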

The situation now obtained is that the tuple (Y₁, Y₂) of stationary jointly-Gaussian processes has an invariant probability distribution Y(t) ∈ G(0, Q_Y).

The relations between the forward and the backward Gaussian system representations are discussed next. It may be assumed that a representation is considered satisfying Q_X > 0, which is equivalent to (A, MQ_V^{1/2}) being a supportable pair if exponential stability is assumed. Below, use is made of the fact that the dimension of the noise process equals m_v = n + p. From a forward system representation one obtains the matrices of the backward system representation by the formulas,

A_b = Q_X A^T Q_X^{-1}, C_b = CQ_X A^T Q_X^{-1} + NQ_V M^T Q_X^{-1}, M = (I_n 0), N = (0 I_p).  (145)

The variance of the noise of the forward and the backward representations is the same, V(t) ∈ G(0, Q_V) and V_b(t) ∈ G(0, Q_V), but the noise processes are different.

The system matrices of the forward system representation, in terms of those of the backward system representation, are,

A = Q_X A_b^T Q_X^{-1}, C = C_b Q_X A_b^T Q_X^{-1} + NQ_V M^T Q_X^{-1}.  (146)

Case (1): Time-Invariant Stationary Gaussian Processes.

The time-invariant Kalman filter system, which is described next, is needed. Consider a time-invariant Gaussian system as defined above and assume that the properties of Def. 4.9 hold. Then there exists a unique solution Q_f of the algebraic Riccati equation (147) with the side conditions (148), (149):

Q_f = AQ_f A^T + MQ_V M^T − [AQ_f C^T + MQ_V N^T][CQ_f C^T + NQ_V N^T]^{-1}[AQ_f C^T + MQ_V N^T]^T,  (147)

Q_f = Q_f^T ≥ 0, Q_f ∈ ℝ^{n×n},  (148)

spec(A − KC) ⊂ 𝔻_o, where,  (149)

K = [AQ_f C^T + MQ_V N^T][CQ_f C^T + NQ_V N^T]^{-1}.  (150)

The time-invariant Kalman filter system is then,

X̂(t) = E[X(t)|ℱ_{t−1}^Y], X̂: Ω×T → ℝ^n,  (151)

X̂(t+1) = AX̂(t) + K[Y(t) − CX̂(t)].  (152)

Define then the innovation process as the process,

V̄(t) = Y(t) − CX̂(t), V̄: Ω×T → ℝ^p,  (153)

Q_{V̄} = CQ_f C^T + NQ_V N^T.  (154)

It can then be proven that, in the case of stationary considerations, the innovation process is a sequence of independent Gaussian random variables with V̄(t) ∈ G(0, Q_{V̄}). The filter system can thus be written as driven by the innovation process, in the form,

X̂(t+1) = AX̂(t) + KV̄(t).  (155)

Another viewpoint is to compare the original Gaussian system to its Kalman realization, defined by,

X̂(t+1) = AX̂(t) + KV̄(t),  (156)

Y(t) = CX̂(t) + V̄(t).  (157)

The Kalman realization of the output process is a Gaussian system of which the state process is such that, at any time, its state X̂(t) = E[X(t)|ℱ_{t−1}^Y] is measurable with respect to the past outputs up to time t−1. A Kalman realization is unique up to a state-space transformation by a nonsingular matrix S ∈ ℝ^{n×n}, as in X̄(t) = SX(t).

Algorithms analogous to Algorithms 1-4 may be developed in terms of the innovation process. The support for this statement is that this process is a sequence of independent random variables, each of which has a Gaussian probability distribution with V̄(t) ∈ G(0, Q_{V̄}). From this point onwards, Algorithms 1-4 can be developed as detailed above.

It is now demonstrated how to compute Wyner's lossy common information in terms of the innovation process.

Algorithm 3(a): Wyner's Lossy Common Information C_W(Y₁, Y₂; Δ₁, Δ₂) for Random Processes.

Consider the time-invariant Gaussian system as defined above.

Input: the integers n, m_v, p₁, p₂, p = p₁ + p₂, and the matrices A ∈ ℝ^{n×n}, C ∈ ℝ^{p×n}, M ∈ ℝ^{n×m_v}, N ∈ ℝ^{p×m_v}, and Q_V ∈ ℝ^{m_v×m_v} with Q_V = Q_V^T ≥ 0 and NQ_V N^T > 0.
1. Check whether the Gaussian system is a minimal weak Gaussian stochastic realization of its output processes by checking the following conditions:
(1) spec(A) ⊂ 𝔻_o;
(2) (A, MQ_V^{1/2}) is a supportable pair;
(3) (A, C) is an observable pair, which is equivalent to the realization being stochastically observable; and
(4) (A_b, C_b) is a co-observable pair, which is equivalent to the realization being stochastically co-observable.
See equation (145) for the formulas of (A_b, C_b) in terms of the matrices defined above. Check also whether rank(C) = p. Continue if all the conditions hold and stop otherwise.
2. Solve the discrete-time algebraic Riccati equation with side conditions for the matrix Q_f according to,

Q f = AQ f  A T + MQ V  M T + - [ AQ f  C T +   MQ V  N T ]  [ CQ f C T + NQ V  N T ] - 1  [ AQ f  C T + MQ V  N T ] T , ( 158 )  Q f= Q f T ≥ 0 , Q f ∈ n × n , ( 159 )  spec  ( A - KC ) ⋐ a , where (160 )  K = [ AQ f  C T + MQ V  N T ]  [ CQ f  C T + NQ V  N T ] -1 . ( 161 )

Because of the conditions of Step 1, the equation (158), with the side conditions (159), (160), has a unique solution.
3. Compute the variance of the innovation process according to,

Q_{V̄} = CQ_f C^T + NQ_V N^T.  (162)

Then Q_{V̄} = Q_{V̄}^T > 0 because of the assumptions.
4. Compute the canonical variable decomposition of the invariant measure of the innovation process, based on the partition of the output into the (p₁, p₂) components, according to Algorithm 2. From this obtain the indices (p₁₁, p₁₂, p₁₃, p₂₃) ∈ ℕ⁴ and the diagonal matrix D ∈ ℝ^{p₁₂×p₁₂} if p₁₂ > 0.
5. Compute Wyner's lossy common information C_W(Y₁, Y₂; Δ₁, Δ₂) according to the formula

$\begin{matrix}{{C_{W}(Y_{1},Y_{2};\Delta_{1},\Delta_{2})} = \begin{cases} 0, & \text{if}\; 0 = p_{11} = p_{21},\; 0 = p_{12} = p_{22},\; p_{13} > 0,\; p_{23} > 0, \\ {\frac{1}{2}\sum_{i=1}^{n} \ln\left( 1 + \frac{2d_{i}}{1-d_{i}} \right)} \in (0, \infty), & \text{if}\; 0 = p_{11} = p_{21},\; p_{12} = p_{22} > 0,\; p_{13} \geq 0,\; p_{23} \geq 0, \\ \infty, & \text{if}\; p_{11} = p_{21} > 0,\; p_{12} = p_{22} \geq 0,\; p_{13} \geq 0,\; p_{23} \geq 0, \end{cases}} & (163)\end{matrix}$
for 0 ≤ Δ₁ ≤ Σ_{j=1}^n (1−d_j) and 0 ≤ Δ₂ ≤ Σ_{j=1}^n (1−d_j).
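Steps 2-5 may be sketched by solving the filtering Riccati equation (158) through duality with scipy's control-form DARE solver, forming the innovation variance (162), and then reusing the Algorithm 2 sketch given earlier on the innovation covariance. The duality substitution and the numerical thresholds below are assumptions of the illustration.

```python
import numpy as np
from scipy.linalg import solve_discrete_are

def wyner_ci_stationary(a, c, m, n_mat, q_v, p1):
    """Sketch of Algorithm 3(a) for processes, Eqs. (158)-(163)."""
    # Filtering Riccati (158) via the dual control DARE:
    # a -> A^T, b -> C^T, q -> M Q_V M^T, r -> N Q_V N^T, s -> M Q_V N^T.
    q_f = solve_discrete_are(a.T, c.T, m @ q_v @ m.T,
                             n_mat @ q_v @ n_mat.T, s=m @ q_v @ n_mat.T)
    q_inn = c @ q_f @ c.T + n_mat @ q_v @ n_mat.T   # Eq. (162)
    _, _, d = canonical_variable_form(q_inn, p1)    # Algorithm 2 sketch above
    if np.any(d > 1 - 1e-9):      # identical components: C_W is infinite
        return np.inf
    d = d[d > 1e-9]               # keep the correlated part only
    # ln(1 + 2d/(1-d)) = ln((1+d)/(1-d)); result in nats, Eq. (163).
    return 0.5 * np.sum(np.log((1 + d) / (1 - d)))
```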

Case (2): Time-Invariant Stationary Gaussian Processes.

This corresponds to Case (2) described above. As recognized herein, Algorithms 1-4 can be derived for time-invariant stationary Gaussian processes as described above. It should be clear to those familiar with the art of rate distortion theory of stationary processes that all information measures in Algorithms 1-4 should be replaced by those of stationary Gaussian processes.

The following describes example implementations of computing Wyner's lossy common information through the application of some of the aspects of Algorithms 1-4.

Special Cases:

Consider the case of a tuple of Gaussian vectors with only private components. Hence the Gaussian distribution is

$\begin{matrix}{(Y_{13},Y_{23}) \in G(0, Q_{(Y_{13},Y_{23})}), \quad Q_{(Y_{13},Y_{23})} = \begin{pmatrix} I & 0 \\ 0 & I \end{pmatrix}, \quad Y_{13}: \Omega \rightarrow \mathbb{R}^{p_{13}}, \quad Y_{23}: \Omega \rightarrow \mathbb{R}^{p_{23}}.} & (164)\end{matrix}$

(a) The minimal σ-algebra which makes Y₁₃, Y₂₃ conditionally independent is the trivial σ-algebra, denoted ℱ₀ = {∅, Ω}. Thus, (ℱ^{Y₁₃}, ℱ^{Y₂₃}|ℱ₀) ∈ CI. The random variable X in this case is the constant X₃ = 0 ∈ ℝ, hence ℱ^{X₃} = ℱ₀.
(b) Then

C_W(Y₁₃, Y₂₃; Δ₁, Δ₂) = I(Y₁₃, Y₂₃; X₃) = 0.  (165)

(c) The signal representations for the generation of the binary sequences (W₁, W₂), i.e., where W₀ is not present because the lossy common information is zero, are

Z₁ = Y₁₃, Z₂ = Y₂₃, X = 0.  (166)

The reproductions (Ẑ₁, Ẑ₂) of (Z₁, Z₂) at the two decoders are,

Ẑ₁ = Ŷ₁₃ = A₁Z₁ + V₁, Ẑ₂ = Ŷ₂₃ = A₂Z₂ + V₂,  (167)

as given in Algorithm 4, with D the zero matrix, and n = p₁₃ = p₂₃.

(The Case of Identical Gaussian Vectors.) Consider the case of a tuple of Gaussian vectors with only the identical part. Hence the Gaussian distribution is

$\begin{matrix}{Y_{11}: \Omega \rightarrow \mathbb{R}^{p_{11}}, \quad Y_{21}: \Omega \rightarrow \mathbb{R}^{p_{21}}, \quad p_{11} = p_{21},} & \\ {(Y_{11},Y_{21}) \in G(0, Q_{(Y_{11},Y_{21})}), \quad Q_{(Y_{11},Y_{21})} = \begin{pmatrix} I & I \\ I & I \end{pmatrix}, \quad Y_{11} = Y_{21} \;\text{almost surely}.} & (168)\end{matrix}$

(a) The only minimal σ-algebra which makes Y₁₁ and Y₂₁ Gaussian conditionally-independent is ℱ^{Y₁₁} = ℱ^{Y₂₁}. The state variable is thus X₁ = Y₁₁ = Y₂₁ and ℱ^{X₁} = ℱ^{Y₁₁} = ℱ^{Y₂₁}.
(b) Wyner's lossy common information equals C_W(Y₁₁, Y₂₁; Δ₁, Δ₂) = +∞. This is expected because one needs an infinite number of bits to represent any sample generated by the common part, which is Gaussian distributed.

General Cases.

Consider a tuple of arbitrary Gaussian random variables. Then the common information is computed by a decomposition and by the use of the formulas obtained earlier.

Assume that the tuple has already been transformed to the canonical variable representation. Note that then the following three tuples of random variables are independent: (Y₁₁, Y₂₁), (Y₁₂, Y₂₂), (Y₁₃, Y₂₃).

Consider a tuple of Gaussian random variables (Y₁, Y₂) ∈ G(0, Q_cvf) as described and decomposed according to Algorithm 2.

(a) Then:

$\begin{matrix}{{C_{W}\left( {Y_{1},{Y_{2};\Delta_{1}},\Delta_{2}} \right)} = {{{C_{W}\left( {Y_{11},{Y_{21};\Delta_{1}},\Delta_{2}} \right)} + {C_{W}\left( {Y_{12},{Y_{22};\Delta_{1}},\Delta_{2}} \right)} + {C_{W}\left( {Y_{13},{Y_{23};\Delta_{1}},\Delta_{2}} \right)}} = \left\{ {{{\begin{matrix}{0,} & {{if},{p_{13} > 0},{p_{23} > 0},{p_{11} = {p_{12} = {p_{21} = {p_{22} = 0}}}},} \\{{\frac{1}{2}{\sum_{i = 1}^{n}{\ln \left( \frac{1 + d_{i}}{1 - d_{i}} \right)}}},} & {{{{if}\mspace{20mu} p_{12}} = {p_{22} > 0}},{p_{11} = {p_{21} = 0}},{p_{13} \geq 0},{p_{23} \geq 0},} \\{{+ \infty},} & {{{if}\mspace{14mu} p_{11}} = {p_{21} > 0}}\end{matrix}.0} \leq \Delta_{1} \leq {\sum\limits_{j = 1}^{n}\left( {1 - d_{j}} \right)}},{0 \leq \Delta_{2} \leq {\sum\limits_{j = 1}^{n}{\left( {1 - d_{j}} \right).}}}} \right.}} & (169)\end{matrix}$

In particular cases, one computes the canonical variable decomposition of the tuple (Y₁, Y₂) and obtains the indices (p₁₁, p₁₂, p₁₃) and (p₂₁, p₂₂, p₂₃). Then:

$\begin{matrix}{{{C_{W}\left( {Y_{11},{Y_{21};\Delta_{1}},\Delta_{2}} \right)} = {+ \infty}},{{{{if}\mspace{14mu} p_{11}} = {p_{12} > 0}};}} & (170) \\{{{C_{W}\left( {Y_{31},{Y_{32};\Delta_{1}},\Delta_{2}} \right)} = 0},{{{{if}\mspace{14mu} p_{31}} > {0\mspace{14mu} {and}\mspace{14mu} p_{23}} > 0};}} & (171) \\{{{C_{W}\left( {Y_{12},{Y_{22};\Delta_{1}},\Delta_{2}} \right)} = {\frac{1}{2}{\sum\limits_{i = 1}^{n}\; {\ln \left( \frac{1 + d_{i}}{1 - d_{i}} \right)}}}},{{{if}\mspace{14mu} p_{12}} = {p_{22} > 0.}}} & (172)\end{matrix}$

Thus C_W(Y₁₂, Y₂₂; Δ₁, Δ₂), given by (172), is the most interesting value, if defined.
(b) The random variable X, defined below, is such that Wyner's lossy common information is attained by the mutual information for this random variable.

$\begin{matrix}{X: \Omega \rightarrow \mathbb{R}^{n}, \; n \in \mathbb{Z}_{+}, \quad n_{1} = p_{11} = p_{21}, \; n_{2} = p_{12} = p_{22}, \; n_{1} + n_{2} = n, \quad X = \begin{pmatrix} X_{1} \\ X_{2} \end{pmatrix}, \; X_{1}: \Omega \rightarrow \mathbb{R}^{n_{1}}, \; X_{2}: \Omega \rightarrow \mathbb{R}^{n_{2}}, \quad X_{1} = Y_{11} = Y_{21},} & (173) \\ {X_{2} = L_{1}Y_{12} + L_{2}Y_{22} + L_{3}V, \;\text{see Algorithm 4, (102); then},} & (174) \\ {(Y_{1},Y_{2},X) \in G(0, Q_{s}(I)), \quad ((Y_{11},Y_{12},Y_{13}), (Y_{21},Y_{22},Y_{23}) \mid X_{1}, X_{2}) \in CI, \quad \mathcal{F}^{X_{1}} \subseteq (\mathcal{F}^{Y_{11}} \vee \mathcal{F}^{Y_{21}}), \; \mathcal{F}^{X_{2}} \subseteq (\mathcal{F}^{Y_{12}} \vee \mathcal{F}^{Y_{22}}); \;\text{then also},} & \\ {C_{W}(Y_{1},Y_{2};\Delta_{1},\Delta_{2}) = I(Y_{1},Y_{2};X).} & (175)\end{matrix}$

(c) At the encoder side the following operations are performed, using (a):

$\begin{matrix}{\mspace{79mu} {{X = \begin{pmatrix}X_{1} \\X_{2}\end{pmatrix}},}} & (176) \\{\mspace{79mu} {{X_{1} = {Y_{11} = Y_{21}}},}} & (177) \\{{X_{2} = {{L_{1}Y_{12}} + {L_{2}Y_{22}} + {L_{3}V}}},{{see}\mspace{14mu} \left( {100,101} \right)\mspace{14mu} {for}\mspace{14mu} {the}\mspace{14mu} {formulas}\mspace{14mu} {of}\mspace{14mu} L_{1,}L_{2}},{L_{3};}} & (178) \\{{\left. \mspace{79mu} {Z_{12} = {Y_{12} - {{E\left\lbrack Y_{12} \right.}\mathcal{F}^{X_{2}}}}} \right\rbrack = {Y_{12} - {Q_{Y_{12},X_{2}}Q_{X_{2}}^{- 1}X_{2}}}},} & (179) \\{\mspace{79mu} {{Z_{22} = {{Y_{22} - {E\left\lbrack Y_{22} \middle| \mathcal{F}^{X_{2}} \right\rbrack}} = {Y_{22} - {Q_{Y_{22},X_{2}}Q_{X_{2}}^{- 1}X_{2}}}}},}} & (180) \\{{Z_{13} = Y_{13}},{Z_{23} = Y_{23}},\left( {{the}\mspace{14mu} {components}\mspace{14mu} Z_{11}\mspace{14mu} {and}\mspace{14mu} Z_{21}\mspace{14mu} {do}\mspace{14mu} {not}\mspace{14mu} {exist}} \right),} & (181) \\{\mspace{79mu} {{Z_{1} = \begin{pmatrix}Z_{12} \\Z_{13}\end{pmatrix}},{Z_{2} = {\begin{pmatrix}Z_{22} \\Z_{23}\end{pmatrix}.}}}} & (182)\end{matrix}$

At the decoder side:

Y₁₁ = X₁ = Y₂₁,  (183)

Y₁₂ = Z₁₂ + Q_{Y₁₂,X₂}Q_{X₂}^{-1}X₂, Y₂₂ = Z₂₂ + Q_{Y₂₂,X₂}Q_{X₂}^{-1}X₂,  (184)

Y₁₃ = Z₁₃, Y₂₃ = Z₂₃.  (185)
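A hedged numpy sketch of the encoder-side operations (176)-(182) follows; the partition sizes and the covariance blocks are assumed to be given, and the names are illustrative.

```python
import numpy as np

def encoder_split(y1, y2, p, x2, q_y12_x2, q_y22_x2, q_x2):
    """Sketch of Eqs. (176)-(182): split the canonical vectors into the
    identical part X1, the residuals Z12, Z22, and private parts Z13, Z23.
    `p = (p11, p12, p13, p21, p22, p23)` holds the canonical indices."""
    p11, p12, p13, p21, p22, p23 = p
    y11, y12, y13 = np.split(y1, [p11, p11 + p12])
    y21, y22, y23 = np.split(y2, [p21, p21 + p22])
    x1 = y11                            # identical part, Eq. (177)
    w = np.linalg.solve(q_x2, x2)       # Q_{X2}^{-1} X2
    z12 = y12 - q_y12_x2 @ w            # Eq. (179)
    z22 = y22 - q_y22_x2 @ w            # Eq. (180)
    z1 = np.concatenate([z12, y13])     # Eq. (182), with Z13 = Y13
    z2 = np.concatenate([z22, y23])
    return x1, z1, z2
```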

(d) The relative gain in complexity, in terms of the dimensions of the vectors transmitted, is then:

$\begin{matrix}{g_{d} = {\frac{p_{11} + p_{12}}{p_{1} + p_{2}} = {\frac{p_{11} + p_{12}}{{2\left( {p_{11} + p_{12}} \right)} + p_{13} + p_{23}} \in {\left\lbrack {0,0.5} \right\rbrack.}}}} & (186)\end{matrix}$

In terms of entropy, the same relative gain can be obtained.

As another example, consider the tuple of Gaussian random variables:

$(Y_{1},Y_{2}) \in G(0, Q_{(Y_{1},Y_{2})}), \quad p_{1} = 3, \; p_{2} = 3, \quad Q_{(Y_{1},Y_{2})} = \begin{pmatrix} I_{p_{1}} & Q_{Y_{1},Y_{2}} \\ Q_{Y_{1},Y_{2}}^{T} & I_{p_{2}} \end{pmatrix}, \quad Q_{Y_{1},Y_{2}} = \begin{pmatrix} 0.8 & 0 & 0 \\ 0 & 0.5 & 0 \\ 0 & 0 & 0.1 \end{pmatrix} \in \mathbb{R}^{p_{1} \times p_{2}}.$

A computation then yields:

${\left( {p_{11},p_{12},p_{13}} \right) = \left( {0,3,0} \right)},{\left( {p_{21},p_{22},p_{23}} \right) = \left( {0,3,0} \right)},{D = {\begin{pmatrix}0.8 & 0 & 0 \\0 & 0.5 & 0 \\0 & 0 & 0.1\end{pmatrix} \in {\mathbb{R}}^{p_{12} \times p_{22}}}},{{C_{W}\left( {Y_{1},{Y_{2};\Delta_{1}},\Delta_{2}} \right)} = 5.0444},{{bits}.}$

The gain in complexity is then

$g_{d} = \frac{0 + 3}{3 + 3} = 3/6 = 0.5.$

As another example, consider the tuple of Gaussian random variables:

$\mspace{20mu} {\left( {Y_{1},Y_{2}} \right) \in {G\left( {0,Q_{({Y_{1},Y_{2}})},{p_{1} = 6},{p_{2} = 5},\mspace{20mu} {Q_{({Y_{1},Y_{2}})} = \begin{pmatrix}I_{p_{1}} & Q_{Y_{1},Y_{2}} \\Q_{Y_{1},Y_{2}} & I_{p_{2}}\end{pmatrix}},{Q_{Y_{1},Y_{2}} = {\begin{pmatrix}0.999998 & 0 & 0 & 0 & 0 \\0 & 0.999992 & 0 & 0 & 0 \\0 & 0 & 0.8 & 0 & 0 \\0 & 0 & 0 & 0.3 & 0 \\0 & 0 & 0 & 0 & 0.000004 \\0 & 0 & 0 & 0 & 0\end{pmatrix} \in {\mathbb{R}}^{p_{1} \times p_{2}}}},} \right.}}$

A computation then yields:

$\begin{matrix}{(p_{11},p_{12},p_{13}) = (2,2,2), \quad (p_{21},p_{22},p_{23}) = (2,2,1),} & (187) \\ {D = \begin{pmatrix} 0.8 & 0 \\ 0 & 0.3 \end{pmatrix} \in \mathbb{R}^{p_{12} \times p_{22}},} & (188) \\ {C_{W}(Y_{1},Y_{2};\Delta_{1},\Delta_{2}) = +\infty \;\text{bits},} & (189) \\ {C_{W}(Y_{12},Y_{22};\Delta_{1},\Delta_{2}) = 4.0630 \;\text{bits}.} & (190)\end{matrix}$

The gain in complexity in terms of dimension is then

$g_{d} = {\frac{2 + 2}{6 + 5} = {{4/11} \approx {0.36.}}}$

As yet another example, consider a tuple of Gaussian random variables for which the covariance matrix is generated by a random number generator. Generate the matrix L ∈ ℝ^{p×p} such that every element has a normal distribution with parameters G(0,1) and such that all elements of the matrix are independent. Then define Q = LL^T to guarantee that the matrix Q is semi-positive-definite. Then:

(Y₁, Y₂) ∈ G(0, Q_{(Y₁,Y₂)}), p₁ = 5, p₂ = 4, p = p₁ + p₂ = 9;

Q_{(Y₁,Y₂)} randomly generated as described above, values not displayed.

The outcome of a computation is then that:

(p₁₁, p₁₂, p₁₃) = (0,4,1); (p₂₁, p₂₂, p₂₃) = (0,4,0);  (191)

C_W(Y₁₂, Y₂₂; Δ₁, Δ₂) = 13.1597 bits.  (192)

The gain in complexity in terms of dimensions is then

$g_{d} = {\frac{0 + 4}{5 + 4} = {{4/9} \approx {0.44.}}}$

As still another example, consider a tuple of Gaussian random variables of which the covariance is generated as in the previous example.

(Y₁, Y₂) ∈ G(0, Q_{(Y₁,Y₂)}), p₁ = 5, p₂ = 4, p = p₁ + p₂ = 9;

Q_{(Y₁,Y₂)} randomly generated as described above.

Then:

(p₁₁, p₁₂, p₁₃) = (0,4,1); (p₂₁, p₂₂, p₂₃) = (0,4,0);  (193)

C_W(Y₁₂, Y₂₂; Δ₁, Δ₂) = 13.9962 bits.  (194)

The gain in complexity is then

$g_{d} = {\frac{0 + 4}{5 + 4} = {{4/9} \approx {0.44.}}}$

As another example, consider the following time-invariant Gaussian system with the indicated partitions. The system is highly structured to make explicit the various components of the common, correlated, and private parts of the system. See also the discussion below the example.

$\begin{matrix}{{{X\left( {t + 1} \right)} = {{\begin{pmatrix}A_{11} & 0 & 0 & 0 \\0 & A_{22} & 0 & 0 \\0 & 0 & A_{33} & 0 \\0 & 0 & \; & A_{44}\end{pmatrix}{X(t)}} + {\begin{pmatrix}I & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\0 & I & 0 & 0 & 0 & 0 & 0 & 0 \\0 & 0 & I & 0 & 0 & 0 & 0 & 0 \\0 & 0 & 0 & I & 0 & 0 & 0 & 0\end{pmatrix}{V(t)}}}},} & (195) \\{{{Y(t)} = {{\begin{pmatrix}C_{11} & 0 & 0 & 0 \\0 & C_{12} & 0 & 0 \\0 & 0 & C_{13} & 0 \\C_{11} & 0 & 0 & 0 \\0 & C_{22} & 0 & 0 \\0 & 0 & 0 & C_{23}\end{pmatrix}{X(t)}} = {\begin{pmatrix}0 & \ldots & 0 & N_{11} & 0 & 0 & 0 \\0 & \ldots & 0 & 0 & N_{12} & 0 & 0 \\0 & \ldots & 0 & 0 & 0 & N_{13} & 0 \\0 & \ldots & 0 & N_{11} & 0 & 0 & 0 \\0 & \ldots & 0 & 0 & N_{22} & 0 & 0 \\0 & \ldots & 0 & 0 & 0 & 0 & N_{23}\end{pmatrix}{V(t)}}}},} & (196) \\{\mspace{79mu} {{n = 8},{m_{v} = 12},{p = 6},{p_{1} = 3},{p_{2} = 3},}} & (197) \\{\mspace{79mu} {{\left( {p_{11},p_{12},p_{13}} \right) = \left( {1,1,1} \right)},{\left( {p_{21},p_{22},p_{23}} \right) = \left( {1,1,1} \right)},{A_{11} = \begin{pmatrix}0 & 1 & 0 \\0 & 0 & 1 \\{- 0.432} & 0.66 & 0.70\end{pmatrix}},{A_{22} = \begin{pmatrix}0 & 1 & 0 \\0 & 0 & 1 \\{- 0.28} & 0.51 & 0.60\end{pmatrix}},{A_{33} = 0.80},{A_{44} = 0.80},}} & (198) \\{{C_{11} = \begin{pmatrix}1 & 0 & 0\end{pmatrix}},{C_{12} = \begin{pmatrix}0 & 0 & 0.8\end{pmatrix}},{C_{22} = \begin{pmatrix}0.6 & 0 & 0\end{pmatrix}},{C_{13} = 1},{C_{23} = 1},} & (199) \\{\mspace{79mu} {{N_{11} = 1},{N_{12} = 1},{N_{13} = 1},{N_{22} = 1},{N_{23} = 1.}}} & (200)\end{matrix}$

A computation with a Matlab computer program yields the results that:

$\begin{matrix}{{\left( {p_{11},p_{12},p_{13}} \right) = \left( {0,2,1} \right)},{\left( {p_{21},p_{22},p_{23}} \right) = \left( {0,2,1} \right)},} & (201) \\{{D = \begin{pmatrix}0.7229 & 0 \\0 & 0.6564\end{pmatrix}},} & (202) \\{{{C\left( {Y_{1},{Y_{2};\Delta_{1}},\Delta_{2}} \right)} = {4.9055\mspace{14mu} {bits}}};} & (203) \\{g_{d} = {\frac{0 + 2}{3 + 3} \approx {33{\%.}}}} & (204)\end{matrix}$

As still another example, consider again a time-invariant Gaussian system with the following parameters:

$\mspace{20mu} {{n = {9 = {n_{1} + n_{2} + n_{3}}}},{n_{1} = 6},{n_{1} = 6},{n_{2} = 1},{n_{3} = 2},{m_{v} = 20},\mspace{20mu} {p = {11 = {p_{1} + p_{2}}}},{p_{1} = 5},{p_{2} = 6},\mspace{20mu} {A = {\begin{pmatrix}A_{11} & 0 & 0 \\0 & A_{22} & 0 \\0 & 0 & A_{33}\end{pmatrix} \in {\mathbb{R}}^{9 \times 9}}},{A_{11} = {\begin{pmatrix}0.9 & 0 & 0 & 0 & 0 & 0 \\0 & 0.8 & 0 & 0 & 0 & 0 \\0 & 0 & 0.7 & 0 & 0 & 0 \\0 & 0 & 0 & 0.6 & 0 & 0 \\0 & 0 & 0 & 0 & 0.5 & 0 \\0 & 0 & 0 & 0 & 0 & 0.4\end{pmatrix} \in {\mathbb{R}}^{6 \times 6}}},{A_{22} = {0.8 \in {\mathbb{R}}}},{A_{33} = {\begin{pmatrix}0.9 & 0 \\0 & 0.7\end{pmatrix} \in {\mathbb{R}}^{2 \times 2}}},\mspace{20mu} {C = {\begin{pmatrix}C_{12} & 0 & 0 \\0 & C_{13} & 0 \\C_{22} & 0 & 0 \\0 & 0 & C_{23}\end{pmatrix} \in {\mathbb{R}}^{11 \times 9}}},\mspace{20mu} {C_{12} = {\begin{pmatrix}{1/4} & 0 & {1/4} & 0 & {1/4} & {1/4} \\{1/4} & {1/4} & 0 & {1/4} & 0 & {1/4} \\{1/4} & {1/4} & {1/4} & 0 & {1/4} & 0 \\{1/4} & {1/4} & {1/4} & {1/4} & 0 & 0\end{pmatrix} \in {\mathbb{R}}^{4 \times 6}}},\mspace{20mu} {C_{22} = {\begin{pmatrix}{1/4} & 0 & {1/4} & {1/4} & {1/4} & 0 \\{1/4} & {1/4} & 0 & 0 & {1/4} & {1/4} \\{1/4} & {1/4} & 0 & {1/4} & {1/4} & 0 \\{1/4} & {1/4} & {1/4} & 0 & 0 & {1/4}\end{pmatrix} \in {\mathbb{R}}^{4 \times 6}}},\mspace{20mu} {C_{13} = {1 \in {\mathbb{R}}}},{C_{23} = {I_{2} \in {\mathbb{R}}^{2 \times 2}}},\mspace{20mu} {M = {\begin{pmatrix}I_{9} & 0\end{pmatrix} \in {\mathbb{R}}^{9 \times 20}}},{N = {\begin{pmatrix}0 & I_{11}\end{pmatrix} \in {{\mathbb{R}}^{11 \times 20}.}}}}$

A computation yields that:

$\begin{matrix}{(p_{11},p_{12},p_{13}) = (0,4,1), \quad (p_{21},p_{22},p_{23}) = (0,4,2),} & (205) \\ {Q_{(S_{1}Y_{1},S_{2}Y_{2})} = \begin{pmatrix} D & 0 \\ 0 & 0 \end{pmatrix}, \quad D = \mathrm{Diag}(d), \quad d = \begin{pmatrix} 0.4781 \\ 0.0980 \\ 0.0682 \\ 0.0039 \end{pmatrix},} & (206) \\ {C(Y_{1},Y_{2};\Delta_{1},\Delta_{2}) = 1.9940 \;\text{bits},} & (207) \\ {g_{d} = \frac{0 + 4}{11} \approx 36\%.} & (208)\end{matrix}$

Moreover, although the foregoing text sets forth a detailed description of numerous different embodiments, it should be understood that the scope of the patent is defined by the words of the claims set forth at the end of this patent. The detailed description is to be construed as exemplary only and does not describe every possible embodiment, because describing every possible embodiment would be impractical, if not impossible. Numerous alternative embodiments could be implemented, using either current technology or technology developed after the filing date of this patent, which would still fall within the scope of the claims. By way of example, and not limitation, the disclosure herein contemplates at least the following aspects:

1. A computer-implemented method of compressively encoding two correlated data vectors, the method comprising: obtaining a first data vector and a second data vector; transforming the first data vector into a first canonical vector, wherein the first canonical vector includes: a first component indicative of information in the first data vector and information in the second data vector, and a second component indicative of information in the first data vector and substantially exclusive of information in the second data vector; transforming the second data vector into a second canonical vector, wherein the second canonical vector includes: a first component indicative of information in the first data vector and information in the second data vector, and a second component indicative of information in the second data vector and substantially exclusive of information in the first data vector; generating: (i) a common information vector based on the first component of the first canonical vector and the first component of the second canonical vector, (ii) a first private vector based on the first canonical vector and the common information vector, and (iii) a second private vector based on the second canonical vector and the common information vector; compressing the first private vector at a first private rate to generate a first digital message; compressing the second private vector at a second private rate to generate a second digital message; computing an amount of common information included in the common information vector; based on the amount of common information, computing a third rate; compressing the common information vector at the third rate to generate a third digital message; and routing the first digital message via a first channel, the second digital message via a second channel, and the third digital message via a third channel.

2. The method of the previous aspect, further comprising obtaining a transmission quality requirement, and computing a rate region based at least in part on the obtained transmission quality requirement.

3. The method of any combination of the preceding aspects, wherein the transmission quality requirement includes a first distortion level, a first distortion function, a second distortion level, and a second distortion function, and computing the rate region includes: computing a lower bound for the first private rate by evaluating a first rate distortion function for a first average distortion not exceeding the first distortion level, and computing a lower bound for the second private rate by evaluating a second rate distortion function for a second average distortion not exceeding the second distortion level.

4. The method of any combination of the preceding aspects, wherein the transmission quality requirement further includes a Gray-Wyner lossy rate region.

5. The method of any combination of the preceding aspects, wherein computing the rate region further includes computing a lower bound for the sum of the first private rate, the second private rate, and the common rate by evaluating a joint rate distortion function for a first average distortion not exceeding the first distortion level, and a second average distortion not exceeding the second distortion level.

6. The method of any combination of the preceding aspects, further comprising computing a lower bound for the first private rate, a lower bound for the second private rate, or a lower bound for a sum of the first private rate, the second private rate, and the third rate by evaluating one or more rate distortion functions using water filling techniques based on the canonical correlation coefficients of the two canonical vectors or the two data vectors, wherein computing the first private rate, the second private rate, or the third rate is at least in part based on the lower bound for the first private rate, the lower bound for the second private rate, or the lower bound for the sum.
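Aspect 6 (like aspect 3) evaluates rate distortion functions as lower bounds via water filling. As an illustration only, the sketch below implements the classical reverse water-filling evaluation of the rate distortion function of a Gaussian vector source under mean-squared error; the function name, the bisection scheme, and the use of covariance eigenvalues (rather than the disclosure's canonical correlation coefficients) are assumptions of this sketch.

```python
import numpy as np

def gaussian_rdf_bits(eigvals, delta, iters=200):
    """Classical reverse water filling for a Gaussian vector source under
    mean-squared error: R(delta) = sum_i 0.5*log2(lam_i / d_i), where
    d_i = min(theta, lam_i) and the water level theta is chosen so that
    sum_i d_i = delta (requires 0 < delta <= sum_i lam_i)."""
    lam = np.asarray(eigvals, dtype=float)
    lo, hi = 0.0, float(lam.max())
    for _ in range(iters):          # bisect on the water level theta
        theta = 0.5 * (lo + hi)
        if np.minimum(theta, lam).sum() < delta:
            lo = theta
        else:
            hi = theta
    d = np.minimum(theta, lam)
    return float(np.sum(0.5 * np.log2(lam / d)))

# Example: three spectral components, total allowed distortion 0.3.
# gaussian_rdf_bits([1.0, 0.5, 0.25], 0.3)
```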

7. The method of any combination of the preceding aspects, wherein: the rate region is a Gray-Wyner lossy rate region; the first private rate, the second private rate, and the third rate are in the Pangloss plane of the Gray-Wyner lossy rate region; and the third rate is a substantially prioritized minimum third rate such that a sum of the first private rate, the second private rate, and the third rate is equal to a joint rate distortion function for the first data vector and the second data vector.

8. The method of any combination of the preceding aspects, further comprising obtaining a covariance matrix for the first data vector and the second data vector, and generating, using the covariance matrix, a first nonsingular transformation matrix for the first data vector and a second nonsingular transformation matrix for the second data vector, wherein transforming the first data vector into the first canonical vector comprises multiplying the first data vector by the first transformation matrix, and transforming the second data vector into the second canonical vector comprises multiplying the second data vector by the second transformation matrix.
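One standard way to produce the two nonsingular transformation matrices of aspect 8 is the textbook canonical correlation analysis construction: whiten each covariance block, then take an SVD of the whitened cross-covariance. The sketch below is that standard construction, not necessarily the disclosure's exact algorithm; all names are illustrative.

```python
import numpy as np

def canonical_transforms(Q11, Q12, Q22):
    """Standard CCA construction from the blocks of a joint covariance
    [[Q11, Q12], [Q21, Q22]]: returns nonsingular M1, M2 and canonical
    correlation coefficients d such that cov(M1 Y1) = I, cov(M2 Y2) = I,
    and cov(M1 Y1, M2 Y2) = Diag(d)."""
    def inv_sqrt(Q):                # symmetric inverse square root
        w, V = np.linalg.eigh(Q)
        return V @ np.diag(w ** -0.5) @ V.T
    R1, R2 = inv_sqrt(Q11), inv_sqrt(Q22)
    U, d, Vt = np.linalg.svd(R1 @ Q12 @ R2)   # whitened cross-covariance
    return U.T @ R1, Vt @ R2, d
```

By construction the two canonical vectors have identity covariances and a diagonal cross-covariance of canonical correlation coefficients, which is the canonical variable form the aspects rely on.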

9. The method of any combination of the preceding aspects, wherein obtaining the covariance matrix comprises analyzing a difference between the first data vector and the second data vector to modify the covariance matrix.

10. The method of any combination of the preceding aspects, wherein computing the amount of common information comprises computing an amount of Wyner's common information and Wyner's lossy common information, and computing the third rate comprises prioritizing the third rate, and assigning to the third rate the computed amount of Wyner's common information, the computed amount of Wyner's lossy common information, or a minimum common rate on the Gray-Wyner rate region such that the sum of the first private rate, the second private rate, and the third rate is equal to a joint rate distortion function for the first data vector and the second data vector or for the first canonical vector and the second canonical vector.
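For jointly Gaussian vectors in canonical variable form, the literature gives a closed form for Wyner's common information in terms of the canonical correlation coefficients; the sketch below evaluates that classical expression in bits. Whether the lossy quantity C(Y1, Y2; Δ1, Δ2) used in this disclosure reduces to this expression depends on the distortion levels, so treat the function as a baseline only; its name is my own.

```python
import numpy as np

def wyner_common_information_bits(d):
    """Classical closed form C = 0.5 * sum_i log2((1 + d_i) / (1 - d_i))
    for a pair of jointly Gaussian vectors in canonical variable form
    with canonical correlation coefficients 0 <= d_i < 1."""
    d = np.asarray(d, dtype=float)
    return float(0.5 * np.sum(np.log2((1.0 + d) / (1.0 - d))))
```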

11. The method of any combination of the preceding aspects, further comprising implementing one or more test channels, wherein compressing the first private vector, the second private vector, or the common information vector includes using at least one of the one or more test channels.

12. The method of any combination of the preceding aspects, wherein the one or more test channels are used to model the rate distortion function of the first private rate, the rate distortion function of the second private rate, or the joint rate distortion function of the sum of the first private rate, the second private rate, and the third rate.

13. The method of any combination of the preceding aspects, wherein: the first canonical vector further comprises a third component; the second canonical vector further comprises a third component; the third component of the first canonical vector is substantially identical to the third component of the second canonical vector; and wherein generating the first private vector excludes the third component of the first canonical vector, and generating the second private vector excludes the third component of the second canonical vector.

14. The method of any combination of the preceding aspects, wherein generating the common information vector comprises generating the common information vector such that the common information vector includes the third component of the first canonical vector or the third component of the second canonical vector.

15. The method of any combination of the preceding aspects, wherein the first data vector and the second data vector have a different number of elements.

16. The method of any combination of the preceding aspects, wherein the first channel and the second channel are private channels and the third channel is a public channel.

17. The method of any combination of the preceding aspects, wherein routing the first digital message, routing the second digital message, and routing the third digital message comprises storing the first digital message at a first memory location, the second digital message at a second memory location, and the third digital message at a third memory location of one or more storage media.

18. The method of any combination of the preceding aspects, wherein the first memory location and the second memory location are secure memory locations of the one or more storage media and the third memory location is an unsecured memory location of the one or more storage media.

19. The method of any combination of the preceding aspects, further comprising obtaining a channel capacity of at least one of: i) the first channel, ii) the second channel, or iii) the third channel, and computing the first private rate, the second private rate, and the third rate based at least in part on the obtained channel capacity.

20. The method of any combination of the preceding aspects, further comprising determining that the first data vector and the second data vector are substantially representative of correlated Gaussian random variables, and generating the first private vector, the second private vector, and the common information vector in response to determining that the first data vector and the second data vector are substantially representative of correlated Gaussian random variables.

21. The method of any combination of the preceding aspects, wherein at least one of (i) the first channel, (ii) the second channel, or (iii) the third channel is a noisy channel comprising a multiple access channel, a broadcast channel, or an interference channel.

22. A non-transitory computer-readable medium storing instructions for compressively encoding two correlated data vectors, wherein the instructions, when executed by one or more processors of a computing system, cause the one or more processors to: obtain a first data vector and a second data vector; transform the first data vector into a first canonical vector, wherein the first canonical vector includes a first component indicative of information in the first data vector and information in the second data vector, and a second component indicative of information in the first data vector and substantially exclusive of information in the second data vector; transform the second data vector into a second canonical vector, wherein the second canonical vector includes a first component indicative of information in the first data vector and information in the second data vector, and a second component indicative of information in the second data vector and substantially exclusive of information in the first data vector; generate: (i) a common information vector based on the first component of the first canonical vector and the first component of the second canonical vector, (ii) a first private vector based on the first canonical vector and the common information vector, and (iii) a second private vector based on the second canonical vector and the common information vector; compress the first private vector at a first private rate to generate a first digital message; compress the second private vector at a second private rate to generate a second digital message; compute an amount of common information included in the common information vector; based on the amount of common information, compute a third rate; compress the common information vector at the third rate to generate a third digital message; and route the first digital message via a first channel, the second digital message via a second channel, and the third digital message via a third channel.

23. The non-transitory computer-readable medium of the previous aspect, wherein the instructions further cause the one or more processors to: obtain a transmission quality requirement, and compute a rate region based at least in part on the obtained transmission quality requirement.

24. The non-transitory computer-readable medium of any combination of the preceding aspects, wherein: the transmission quality requirement includes a first distortion level, a first distortion function, a second distortion level, and a second distortion function, and to compute the rate region, the instructions further cause the one or more processors to: compute a lower bound for the first private rate by evaluating a first rate distortion function for a first average distortion not exceeding the first distortion level, and compute a lower bound for the second private rate by evaluating a second rate distortion function for a second average distortion not exceeding the second distortion level.

25. The non-transitory computer-readable medium of any combination of the preceding aspects, wherein to compute the rate region, the instructions further cause the one or more processors to compute a lower bound for the sum of the first private rate, the second private rate, and the common rate by evaluating a joint rate distortion function for the first average distortion not exceeding the first distortion level, and the second average distortion not exceeding the second distortion level.

26. The non-transitory computer-readable medium of any combination of the preceding aspects, wherein the instructions further cause the one or more processors to: compute a lower bound for a sum of the first private rate, the second private rate, and the third rate by evaluating one or more rate distortion functions using water filling techniques based on covariance or canonical correlation coefficients of the two canonical vectors or the two data vectors, and compute the first private rate, the second private rate, or the third rate at least in part based on the lower bound for the sum.

27. The non-transitory computer-readable medium of any combination of the preceding aspects, wherein the instructions further cause the one or more processors to obtain a covariance matrix for the first data vector and the second data vector, and generate, using the covariance matrix, a first nonsingular transformation matrix for the first data vector and a second nonsingular transformation matrix for the second data vector, wherein to transform the first data vector into the first canonical vector, the instructions further cause the one or more processors to multiply the first data vector by the first transformation matrix, and to transform the second data vector into the second canonical vector, the instructions further cause the one or more processors to multiply the second data vector by the second transformation matrix.

28. The non-transitory computer-readable medium of any combination of the preceding aspects, wherein to compute the amount of common information, the instructions further cause the one or more processors to compute an amount of Wyner's common information or Wyner's lossy common information, and to compute the third rate, the instructions further cause the one or more processors to compute the third rate based at least in part on the computed amount of Wyner's common information and Wyner's lossy common information.

What is claimed is:
1. A computer-implemented method of compressively encoding two correlated data vectors, the method comprising: obtaining a first data vector and a second data vector; transforming the first data vector into a first canonical vector, wherein the first canonical vector includes: a first component indicative of information in the first data vector and information in the second data vector, and a second component indicative of information in the first data vector and substantially exclusive of information in the second data vector; transforming the second data vector into a second canonical vector, wherein the second canonical vector includes: a first component indicative of information in the first data vector and information in the second data vector, and a second component indicative of information in the second data vector and substantially exclusive of information in the first data vector; generating: (i) a common information vector based on the first component of the first canonical vector and the first component of the second canonical vector, (ii) a first private vector based on the first canonical vector and the common information vector, and (iii) a second private vector based on the second canonical vector and the common information vector; compressing the first private vector at a first private rate to generate a first digital message; compressing the second private vector at a second private rate to generate a second digital message; computing an amount of common information included in the common information vector; based on the amount of common information, computing a third rate; compressing the common information vector at the third rate to generate a third digital message; and routing the first digital message via a first channel, the second digital message via a second channel, and the third digital message via a third channel.
2. The method of claim 1, further comprising: obtaining a transmission quality requirement, and computing a rate region based at least in part on the obtained transmission quality requirement.
3. The method of claim 2, wherein: the transmission quality requirement includes a first distortion level, a first distortion function, a second distortion level, and a second distortion function, and computing the rate region includes: computing a lower bound for the first private rate by evaluating a first rate distortion function for a first average distortion not exceeding the first distortion level, and computing a lower bound for the second private rate by evaluating a second rate distortion function for a second average distortion not exceeding the second distortion level.
4. The method of claim 3, wherein the transmission quality requirement further includes a Gray-Wyner lossy rate region.
5. The method of claim 3, wherein: computing the rate region further includes computing a lower bound for the sum of the first private rate, the second private rate, and the common rate by evaluating a joint rate distortion function for a first average distortion not exceeding the first distortion level, and a second average distortion not exceeding the second distortion level.
6. The method of claim 3, further comprising: computing a lower bound for the first private rate, a lower bound for the second private rate, or a lower bound for a sum of the first private rate, the second private rate, and the third rate by evaluating one or more rate distortion functions using water filling techniques based on the canonical correlation coefficients of the two canonical vectors or the two data vectors, wherein: computing the first private rate, the second private rate, or the third rate is at least in part based on the lower bound for the first private rate, the lower bound for the second private rate, or the lower bound for the sum.

7. The method of claim 2, wherein: the rate region is a Gray-Wyner lossy rate region, the first private rate, the second private rate, and the third rate are in the Pangloss plane of the Gray-Wyner lossy rate region, and the third rate is a substantially prioritized minimum third rate such that a sum of the first private rate, the second private rate, and the third rate is equal to a joint rate distortion function for the first data vector and the second data vector.
8. The method of claim 1, further comprising: obtaining a covariance matrix for the first data vector and the second data vector, and generating, using the covariance matrix, a first nonsingular transformation matrix for the first data vector and a second nonsingular transformation matrix for the second data vector, wherein: transforming the first data vector into the first canonical vector comprises multiplying the first data vector by the first transformation matrix, and transforming the second data vector into the second canonical vector comprises multiplying the second data vector by the second transformation matrix.
9. The method of claim 8, wherein obtaining the covariance matrix comprises: analyzing a difference between the first data vector and the second data vector to modify the covariance matrix.
10. The method of claim 1, wherein: computing the amount of common information comprises computing an amount of Wyner's common information and Wyner's lossy common information, and computing the third rate comprises assigning to the third rate the computed amount of Wyner's common information, the computed amount of Wyner's lossy common information, or a minimum common rate on the Gray-Wyner rate region such that the sum of the first private rate, the second private rate, and the third rate is equal to a joint rate distortion function for the first data vector and the second data vector or for the first canonical vector and the second canonical vector.

11. The method of claim 1, further comprising: implementing one or more test channels, wherein: compressing the first private vector, the second private vector, or the common information vector includes using at least one of the one or more test channels.
12. The method of claim 11, wherein: the one or more test channels are used to model the rate distortion function of the first private rate, the rate distortion function of the second private rate, or the joint rate distortion function of the sum of the first private rate, the second private rate, and the third rate.
13. The method of claim 1, wherein: the first canonical vector further comprises a third component; the second canonical vector further comprises a third component; the third component of the first canonical vector is substantially identical to the third component of the second canonical vector; and wherein: generating the first private vector excludes the third component of the first canonical vector, and generating the second private vector excludes the third component of the second canonical vector.
14. The method of claim 13, wherein generating the common information vector comprises: generating the common information vector such that the common information vector includes the third component of the first canonical vector or the third component of the second canonical vector.
15. The method of claim 1, wherein the first data vector and the second data vector have a different number of elements.
16. The method of claim 1, wherein the first channel and the second channel are private channels and the third channel is a public channel.
17. The method of claim 1, wherein routing the first digital message, routing the second digital message, and routing the third digital message comprises: storing the first digital message at a first memory location, the second digital message at a second memory location, and the third digital message at a third memory location of one or more storage media.
18. The method of claim 17, wherein: the first memory location and the second memory location are secure memory locations of the one or more storage media and the third memory location is an unsecured memory location of the one or more storage media.
19. The method of claim 1, further comprising: obtaining a channel capacity of at least one of: i) the first channel, ii) the second channel, or iii) the third channel, and computing the first private rate, the second private rate, and the third rate based at least in part on the obtained channel capacity.
20. The method of claim 1, further comprising determining that the first data vector and the second data vector are substantially representative of correlated Gaussian random variables, and generating the first private vector, the second private vector, and the common information vector in response to determining that the first data vector and the second data vector are substantially representative of correlated Gaussian random variables.
21. The method of claim 1, wherein: at least one of (i) the first channel, (ii) the second channel, or (iii) the third channel is a noisy channel comprising a multiple access channel, a broadcast channel, or an interference channel.
22. A non-transitory computer-readable medium storing instructions for compressively encoding two correlated data vectors, wherein the instructions, when executed by one or more processors of a computing system, cause the one or more processors to: obtain a first data vector and a second data vector; transform the first data vector into a first canonical vector, wherein the first canonical vector includes: a first component indicative of information in the first data vector and information in the second data vector, and a second component indicative of information in the first data vector and substantially exclusive of information in the second data vector; transform the second data vector into a second canonical vector, wherein the second canonical vector includes: a first component indicative of information in the first data vector and information in the second data vector, and a second component indicative of information in the second data vector and substantially exclusive of information in the first data vector; generate: (i) a common information vector based on the first component of the first canonical vector and the first component of the second canonical vector, (ii) a first private vector based on the first canonical vector and the common information vector, and (iii) a second private vector based on the second canonical vector and the common information vector; compress the first private vector at a first private rate to generate a first digital message; compress the second private vector at a second private rate to generate a second digital message; compute an amount of common information included in the common information vector; based on the amount of common information, compute a third rate; compress the common information vector at the third rate to generate a third digital message; and route the first digital message via a first channel, the second digital message via a second channel, and the third digital message via a third channel.
23. The non-transitory computer-readable medium of claim 22, wherein the instructions further cause the one or more processors to: obtain a transmission quality requirement, and compute a rate region based at least in part on the obtained transmission quality requirement.

24. The non-transitory computer-readable medium of claim 23, wherein: the transmission quality requirement includes a first distortion level, a first distortion function, a second distortion level, and a second distortion function, and to compute the rate region, the instructions further cause the one or more processors to: compute a lower bound for the first private rate by evaluating a first rate distortion function for a first average distortion not exceeding the first distortion level, and compute a lower bound for the second private rate by evaluating a second rate distortion function for a second average distortion not exceeding the second distortion level.
25. The non-transitory computer-readable medium of claim 24, wherein to compute the rate region, the instructions further cause the one or more processors to: compute a lower bound for a sum of the first private rate, the second private rate, and the common rate by evaluating a joint rate distortion function for the first average distortion not exceeding the first distortion level, and the second average distortion not exceeding the second distortion level.
26. The non-transitory computer-readable medium of claim 24, wherein the instructions further cause the one or more processors to: compute a lower bound for a sum of the first private rate, the second private rate, and the third rate by evaluating one or more rate distortion functions using water filling techniques based on covariance or canonical correlation coefficients of the two canonical vectors or the two data vectors, and to compute the first private rate, the second private rate, or the third rate, the instructions further cause the one or more processors to compute the first private rate, the second private rate, or the third rate at least in part based on the lower bound for the sum.
27. The non-transitory computer-readable medium of claim 22, wherein the instructions further cause the one or more processors to: obtain a covariance matrix for the first data vector and the second data vector, and generate, using the covariance matrix, a first nonsingular transformation matrix for the first data vector and a second nonsingular transformation matrix for the second data vector, wherein: to transform the first data vector into the first canonical vector, the instructions further cause the one or more processors to multiply the first data vector by the first nonsingular transformation matrix, and to transform the second data vector into the second canonical vector, the instructions further cause the one or more processors to multiply the second data vector by the second nonsingular transformation matrix.
28. The non-transitory computer-readable medium of claim 22, wherein: to compute the amount of common information, the instructions further cause the one or more processors to compute an amount of Wyner's common information or Wyner's lossy common information, and to compute the third rate, the instructions further cause the one or more processors to compute the third rate based at least in part on the computed amount of Wyner's common information and Wyner's lossy common information.